Abstract


Theoretically, multiple-input multiple-output (MIMO) wireless systems can achieve remarkably
high spectral efficiency as compared to conventional, single-antenna systems. This report identifies
a number of problems which need to be solved in order to implement practical MIMO systems:
channel estimation, correlated fading, slow fading, asynchronous reception, and frequency-selective
fading. The effects of these non-ideal conditions on the performance of MIMO systems are evaluated,
and directions are explored in which solutions may be found. The focus of the report is
on MIMO systems employing an iterative (“turbo”) receiver. The results presented are based on
the iterative tree search (ITS) detection scheme developed recently at CRC, but are expected to be
typical of most iterative detectors that have been proposed in the MIMO literature. The main
conclusion of the report is that iterative channel estimation and further development of an ITS-based
detection scheme for asynchronous and wideband reception are the two most promising topics for
future research in this area.


Executive summary


This report addresses the use of iterative detection in real-world multiple-input multiple-output
(MIMO) wireless systems, which are theoretically capable of achieving remarkably high spectral
efficiency through spatial multiplexing. A number of issues are identified which need to be overcome
in order to implement practical MIMO systems, as discussed below. The focus of this report
is on MIMO systems employing iterative (“turbo”) receivers. The results presented herein are based
on the iterative tree search (ITS) detection scheme developed recently at CRC, but are expected to
be typical of most iterative MIMO detection schemes.

Channel estimation provides the detector with estimates of the MIMO channel matrix and the
noise variance. If channel variations are sufficiently slow, channel training by means of the
transmission of known pilot symbol vectors results in negligible performance loss compared to the
hypothetical case of perfect channel knowledge, which is commonly assumed to be available. If the
channel varies rapidly, the use of soft decision feedback is recommended in order to keep the
training overhead acceptable. A method is discussed which is expected to overcome the vulnerability of
existing soft decision feedback schemes to unreliable feedback. Correlated fading is well-known
to degrade information-theoretic MIMO capacity, and, in addition, to decrease the efficiency of
sub-optimal detection schemes such as ITS in approaching this theoretical limit. Although
performance loss is inevitable in severely correlated fading, it is shown that the efficiency of ITS detection
under such conditions can possibly be improved through asynchronous transmission, as discussed
below. Slow fading, which is most likely to occur in indoor applications, enables accurate channel
estimation with low training overhead, but is also shown to lead to performance degradation due to
lack of temporal diversity. Asynchronous reception, which can occur, for example, in the downlink
of networks with geographically distributed transmit arrays, precludes the use of existing MIMO
detection schemes. It is possible to modify the ITS detector such that it can be applied in
asynchronous scenarios. The modified scheme, called A-ITS, also enables intentionally asynchronous
transmission, which, interestingly, results in considerable performance improvement. Because it
is inherently capable of dealing with channels with memory, it is expected that A-ITS detection is
also suitable for frequency-selective fading channels. Frequency diversity gains can possibly
compensate for performance loss due to low temporal diversity, or can be traded off for lower receiver
complexity.

The  two  most  promising  topics  for  future  research  concerning  ITS-based  MIMO  detection  are 
iterative  channel  estimation  and  the  further  development  of  the  A-ITS  scheme.  The  development 
of  a  suitable  channel  estimation  scheme  will  remove  the  main  obstacle  in  the  implementation  of 
practical  MIMO  systems  based  on  the  ITS  detector.  The  use  of  A-ITS  instead  of  synchronous  ITS 
detection  will  increase  the  range  of  applications  of  ITS-based  detection  to  include  wideband  and 
asynchronous  scenarios,  as  well  as  schemes  in  which  the  number  of  transmit  antennas  exceeds  the 
number  of  receive  antennas,  such  as  transmit  diversity.  Special  attention  will  have  to  be  paid  to  the 
relatively  unexplored  area  of  wideband  and  asynchronous  channel  estimation. 

De Jong, Y.L.C. 2003. On the implementation of iterative detection in real-world MIMO wireless
systems. DRDC Ottawa TR 2003-242. CRC-RP-2003-009. Communications Research Centre
Canada.
1 Introduction


In recent years there has been much interest in wireless communication systems employing multiple
antennas at both the transmitter and the receiver. This class of systems, commonly referred to
as multiple-input multiple-output (MIMO), has the potential of achieving remarkably high spectral
efficiency in rich multipath environments. The use of space-time bit-interleaved coded modulation
(ST-BICM) in conjunction with an iterative receiver, in which soft-input soft-output detection
and decoding stages exchange reliability information on the transmitted bits in an iterative fashion,
has been shown to yield particularly good performance [1-7]. Under certain circumstances, e.g.,
turbo coding, rapid channel fading, and large interleaver depth, such systems can even approach the
ergodic MIMO capacity limit [3].

As increased spectral efficiency is the primary reason for employing MIMO wireless systems,
the use of a large number of antennas and high-order modulation such as 64-QAM is of special
interest. Unfortunately, a problem with the optimum soft-input soft-output detector, the maximum
a posteriori probability (MAP) detector, is that its computational complexity is exponential in the
number of bits transmitted simultaneously in each symbol interval, which makes its application
infeasible even for a moderate number of transmit antennas and/or modulation order. In order to
solve this problem, several suboptimum, reduced-complexity detection schemes have been proposed;
among them are the soft-cancellation minimum mean squared error (SC-MMSE) detector
[4,6,7], the list sphere detector [3], and the iterative tree search (ITS) detector [1,2,8] developed
at CRC. The complexity per bit of the ITS detector is only linear in $N_t$ and, if a special type of
bit mapping called multilevel mapping is used, roughly independent of the modulation order. It
has been demonstrated that, for the same computational complexity, the error performance of the
ITS detector is comparable to or better than that of SC-MMSE and list sphere detection even for
relatively small list sizes. This report focuses on the use of the ITS scheme.

In  real-world  MIMO  systems,  channel  conditions  are  generally  not  as  favourable  as  is  often 
assumed  for  the  sake  of  tractability  and,  also,  knowledge  of  the  channel  state  is  not  available  a  priori, 
but  must  be  estimated  using  a  suitable  channel  estimation  scheme.  Multipath  propagation  may  lead 
to  frequency-selective  fading  and,  hence,  inter-symbol  interference,  and  the  signals  transmitted  from 
different  antennas  may  not  be  synchronized  at  the  receiver.  Furthermore,  reduced  diversity  due  to 
correlated  and/or  slow  fading  is  known  to  degrade  the  performance  of  MIMO  systems.  The  effects 
of  such  non-ideal  conditions  on  the  performance  of  MIMO  systems  employing  ITS  detection,  and 
the  question  of  how  to  deal  with  them  in  practical  system  implementations,  have  not  been  addressed 
to  date.  These  issues  are  the  topic  of  this  report.  The  main  aim  herein  is  to  identify  problems 
that  need  to  be  overcome  in  order  to  implement  a  practical  ST-BICM  MIMO  system  based  on  ITS 
detection,  and  to  explore  directions  in  which  solutions  to  these  problems  may  be  found. 

The  outline  of  the  report  is  as  follows.  Section  2  reviews  ST-BICM,  iterative  processing  and 
MAP  detection,  and  introduces  notation  that  will  be  used  in  the  rest  of  the  report.  The  ITS  detection 
scheme and its complexity and performance under ideal channel conditions are reviewed in
Section 3. Section 4 contains the main contributions of this report. It identifies a number of issues that
can  impact  practical  MIMO  systems  based  on  ITS  detection,  provides  an  evaluation  of  their  effects, 
and  discusses  approaches  to  deal  with  these  issues.  Final  conclusions  are  drawn  in  Section  5. 
2 Review of iterative MIMO detection and decoding


2.1  ST-BICM 

Consider a MIMO system with $N_t$ transmit antennas and $N_r$ receive antennas (“$N_t \times N_r$ MIMO”),
employing ST-BICM and an iterative receiver (see Fig. 2.1), and assume that $N_r \ge N_t$. Information
bits are encoded, interleaved, serial-to-parallel converted and then mapped onto successive symbol
vectors $\mathbf{s} = [s_1, \cdots, s_{N_t}]^T$ by the MIMO symbol vector mapper. The channel code can be an
off-the-shelf code, e.g., a convolutional or turbo code. The modulation format is identical for all
transmit antennas, and the number of bits per constellation point is denoted by $M_c$. In Fig. 2.1, the
vector

$$\mathbf{x} = [x_{1,1}, \cdots, x_{1,M_c}, x_{2,1}, \cdots, x_{N_t,M_c}]^T \qquad (2.1)$$

represents a block of $N_t M_c$ code bits from the output of the interleaver that is subsequently
serial-to-parallel converted to $N_t$ subblocks

$$\mathbf{x}_n = [x_{n,1}, \cdots, x_{n,M_c}]^T, \qquad (2.2)$$

$n = 1, \cdots, N_t$, and then mapped onto $\mathbf{s}$ according to $s_n = \mathrm{map}(\mathbf{x}_n)$. Here, $x_{n,k}$ denotes the $k$th
bit mapped onto the $n$th symbol. For later convenience, the binary alphabet is defined by the set
$\{-1, +1\}$.
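
As a minimal illustration of the serial-to-parallel conversion (2.2) and the mapping $s_n = \mathrm{map}(\mathbf{x}_n)$, the following Python sketch splits a block of $N_t M_c$ code bits into subblocks and maps each subblock to a symbol. The function names and the Gray-mapped QPSK constellation ($M_c = 2$) are assumptions made for the example; the report does not fix a particular mapping at this point.

```python
import math

N_T = 4   # number of transmit antennas (example value)
M_C = 2   # bits per constellation point (QPSK, assumed for this sketch)


def qpsk_map(bits):
    """ASSUMED mapping: two +/-1 bits -> one unit-energy QPSK symbol."""
    b_re, b_im = bits
    return complex(b_re, b_im) / math.sqrt(2)


def map_symbol_vector(x):
    """Split x (N_T*M_C bits in {-1,+1}) into subblocks x_n and map each to s_n."""
    if len(x) != N_T * M_C:
        raise ValueError("x must contain N_T * M_C bits")
    subblocks = [tuple(x[n * M_C:(n + 1) * M_C]) for n in range(N_T)]
    return [qpsk_map(x_n) for x_n in subblocks]


# One block of interleaved code bits, ordered as in (2.1).
x = [+1, -1, -1, -1, +1, +1, -1, +1]
s = map_symbol_vector(x)
```

A higher-order constellation would only change `M_C` and the mapping function; the subblock slicing stays the same.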

In the following it is assumed that the channel, denoted by the $N_r \times N_t$ matrix $\mathbf{H}$, is random,
frequency non-selective over the bandwidth of interest, and known perfectly at the receiver. The more
realistic situation where only an imperfect estimate of $\mathbf{H}$ is available at the receiver is considered in
Section 4.1. It is also assumed that $\mathbf{H}$ is full rank with probability one. The received signal vector
is written as

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n}, \qquad (2.3)$$

where $\mathbf{n}$ is an additive noise vector whose elements are independent, complex-valued Gaussian
variables with zero mean and variance $\sigma_n^2$. Eq. (2.3) represents a single use of the channel, which
corresponds to the transmission of $N_t M_c$ code bits. The channel is accessed repeatedly to transmit
a frame of bits typically spanning many vectors $\mathbf{x}$, and may vary from one channel use to the next.
This is referred to as a block fading channel model in [5]. The depth of the interleaver is equal to
the frame size, and is therefore typically much larger than $N_t M_c$.

2.2  Iterative  processing 

Iterative processing is a technique in which reliability information on the transmitted bits is
exchanged between the detector and the channel decoder in an iterative manner, such that the decisions
made by the channel decoder are improved by the detection process and vice versa. The detection
and decoding stages in an iterative receiver must be capable of both accepting and producing soft
information, and are therefore referred to as being soft-input soft-output. A block diagram of an
iterative MIMO receiver is shown in the lower part of Fig. 2.1. The task of the MIMO detector is to
provide updates, commonly called “extrinsic” information, $L_E$, on the soft-decision estimates of the
code/channel bits. The computation of $L_E$ is based on the received signal vector, $\mathbf{y}$, and any
soft-decision bit information that may already exist. The latter information, called a priori information
and denoted by $L_A$, becomes available at the output of the soft-input soft-output channel decoder
after the first iteration, from where it is fed back to the detector input after being re-interleaved. The
channel decoder is an algorithm that performs an update of the so-called a posteriori probabilities
(APPs) of both the information and the coded bits based on the code constraint. In the case of
convolutional coding, the decoding algorithm that is optimum in the sense that it maximizes the APP
of each bit is the symbol-by-symbol MAP decoder, which is implemented efficiently in the BCJR
algorithm [9]. In the case of turbo coding, which involves the parallel concatenation of two
convolutional codes, the decoder is implemented by iterating the two symbol-by-symbol MAP decoders
of the constituent codes [10]. Note that the receiver in an iterative MIMO system employing turbo
coding performs two nested iterative procedures, the inner one associated with the turbo decoder
and the outer one with the iterative detection and decoding process.

Figure 2.1. Block diagram of a MIMO system employing ST-BICM and an iterative receiver. $\Pi$ and $\Pi^{-1}$
denote interleaving and deinterleaving, respectively. $L(\cdot; I)$ and $L(\cdot; O)$ denote the soft inputs and outputs of
the channel decoder, respectively. The letters c and u refer to coded and uncoded bits, respectively.

Generally, increasing the number of iterations in the detector/decoder loop improves the
performance of an iterative receiver, but also increases its complexity proportionally. The optimum choice
of the number of iterations is therefore determined by the receiver's convergence behaviour. This
behaviour has been investigated experimentally for the soft-cancellation minimum mean squared
error (SC-MMSE) detector [4, 6] and the MAP detector [5], discussed in Section 2.3. The results
of these investigations are difficult to compare, because different numbers of transmit and receive
antennas, channel coding schemes, interleavers, etc., were used. In all cases, however, complete
convergence was reached after six or fewer iterations, while most of the gain with respect to
non-iterative detection and decoding was obtained in the first two iterations, with diminishing gains in
later iterations.

Theoretical studies concerning the convergence behaviour of iterative decoders and
single-antenna iterative detection and decoding schemes have been reported in [11, 12], and were later
extended for MIMO systems [13]. However, the method employed in these studies is based on
an empirical characterization of the detector and decoder modules, by means of computer
simulations. Although this method has been demonstrated to provide useful design guidelines for iterative
systems, it does not provide any fundamental understanding as to how the design of the detector
can improve the convergence properties of iterative MIMO receivers. The effect of the number
of iterations on the error performance of an ST-BICM system employing the ITS detector will be
investigated by computer simulations in Section 3.5.
2.3 MAP detection


The optimum detector in an iterative MIMO receiver is well known to be the MAP detector [3],
sometimes also called APP detector. This detector computes extrinsic reliability information on the
channel bits, expressed as log-likelihood ratios (LLRs), as

$$L_E(x_{n,k}) = \log \frac{\sum_{\mathbf{x} \in X_{n,k}^{+1}} \exp \mu(\mathbf{s})}{\sum_{\mathbf{x} \in X_{n,k}^{-1}} \exp \mu(\mathbf{s})} - L_A(x_{n,k}). \qquad (2.4)$$

In this expression, $\log(\cdot)$ denotes the natural logarithm, and $X_{n,k}^{+1}$ and $X_{n,k}^{-1}$ are the sets of all possible
bit sequences $\mathbf{x}$ for which $x_{n,k}$ is $+1$ and $-1$, respectively, i.e.,

$$X_{n,k}^{\pm 1} = \{\mathbf{x} \mid x_{n,k} = \pm 1\}. \qquad (2.5)$$

The metric $\mu(\mathbf{s})$ in (2.4) is given by

$$\mu(\mathbf{s}) = -\frac{1}{\sigma_n^2} \|\mathbf{y} - \mathbf{H}\mathbf{s}\|^2 + \sum_{i=1}^{N_t} \sum_{j \in J_i} L_A(x_{i,j}), \qquad (2.6)$$

where

$$J_i = \{ j \mid j \in \{1, \cdots, M_c\} \text{ and } x_{i,j} = +1 \}. \qquad (2.7)$$

The complexity of this detector is proportional to the number of different bit sequences contained in
$X_{n,k}^{+1}$ and $X_{n,k}^{-1}$, which is equal to $2^{N_t M_c}$. This property makes the use of MAP detection impractical,
especially if the number of transmit antennas and/or the signal constellation size are large.
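
For very small systems, (2.4) to (2.7) can be evaluated by brute force, which makes the exponential cost visible. The Python sketch below is an illustration only: the QPSK mapping, the function names and the toy $2 \times 2$ channel are assumptions, not part of the report.

```python
import itertools
import math


def qpsk_map(bits):
    """ASSUMED mapping: two +/-1 bits -> one unit-energy QPSK symbol."""
    return complex(bits[0], bits[1]) / math.sqrt(2)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of metrics."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def metric(x, y, H, sigma2, LA, m_c=2):
    """mu(s) of (2.6) for a bit sequence x given as a tuple of +/-1."""
    n_t = len(x) // m_c
    s = [qpsk_map(x[n * m_c:(n + 1) * m_c]) for n in range(n_t)]
    dist2 = sum(abs(y[r] - sum(H[r][n] * s[n] for n in range(n_t))) ** 2
                for r in range(len(y)))
    prior = sum(LA[i] for i, bit in enumerate(x) if bit == +1)
    return -dist2 / sigma2 + prior


def map_detect(y, H, sigma2, LA, m_c=2):
    """Extrinsic LLRs (2.4) by enumerating all 2^(N_t*M_c) bit sequences."""
    n_bits = len(H[0]) * m_c
    table = [(x, metric(x, y, H, sigma2, LA, m_c))
             for x in itertools.product((-1, +1), repeat=n_bits)]
    LE = []
    for k in range(n_bits):
        plus = log_sum_exp([mu for x, mu in table if x[k] == +1])
        minus = log_sum_exp([mu for x, mu in table if x[k] == -1])
        LE.append(plus - minus - LA[k])
    return LE


# Toy example: orthogonal 2x2 channel, noiseless observation of x_true.
H = [[1.0, 0.0], [0.0, 1.0]]
x_true = (+1, -1, -1, +1)
y = [qpsk_map(x_true[0:2]), qpsk_map(x_true[2:4])]
LE = map_detect(y, H, sigma2=0.5, LA=[0.3, -1.0, 2.0, 0.5])
```

With this orthogonal channel and a noiseless observation, each extrinsic LLR evaluates to $\pm 2/\sigma_n^2 = \pm 4$ with the sign of the transmitted bit, independently of the a priori values. The list `table` has $2^{N_t M_c}$ entries, so four antennas with 64-QAM would already require $2^{24}$ metric evaluations per channel use.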
3 Iterative tree search detection


The suboptimum ITS detection scheme proposed herein has a much lower complexity than the MAP
detector of Section 2.3 as it evaluates only the bit sequences $\mathbf{x}$ that contribute significantly to the
detector output (2.4), i.e., those for which the metric $\mu(\mathbf{s})$ is “large”. To this end, in each receiver
iteration, a list of good candidate bit sequences (and their corresponding metrics) is generated prior
to the computation of the extrinsic information, with the aid of a breadth-first tree search algorithm
known as the M-algorithm [14-16]. A detailed description of the basic ITS detection scheme
follows in Section 3.1. For high-order QAM constellations, the complexity of the ITS detector can be
reduced further by using a multilevel bit mapping. This is the subject of Section 3.2.

3.1  Basic  scheme 

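The squared vector norm in (2.6) can be rewritten as

$$\|\mathbf{y} - \mathbf{H}\mathbf{s}\|^2 = (\mathbf{s} - \hat{\mathbf{s}})^\dagger \mathbf{H}^\dagger \mathbf{H} (\mathbf{s} - \hat{\mathbf{s}}), \qquad (3.1)$$

where $\hat{\mathbf{s}} = [\hat{s}_1, \cdots, \hat{s}_{N_t}]^T = (\mathbf{H}^\dagger \mathbf{H})^{-1} \mathbf{H}^\dagger \mathbf{y}$ is the unconstrained maximum-likelihood (ML) solution
[3]. The superscript $\dagger$ denotes conjugate transpose. When $N_r > N_t$, the right-hand side of (3.1) omits
the term $\|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}\|^2$, which does not depend on $\mathbf{s}$ and therefore cancels in (2.4). Because $\mathbf{H}^\dagger \mathbf{H}$ is
Hermitian and positive-definite, it has a Cholesky decomposition $\mathbf{H}^\dagger \mathbf{H} = \mathbf{L}^\dagger \mathbf{L}$, in which $\mathbf{L} = [l_{ij}]$ is an
$N_t \times N_t$ lower triangular matrix with real, positive diagonal entries. The metric $\mu(\mathbf{s})$ can now be
rewritten as

$$\mu(\mathbf{s}) = -\frac{1}{\sigma_n^2} \sum_{i=1}^{N_t} \Bigl| l_{ii}(s_i - \hat{s}_i) + \sum_{j=1}^{i-1} l_{ij}(s_j - \hat{s}_j) \Bigr|^2 + \sum_{i=1}^{N_t} \sum_{j \in J_i} L_A(x_{i,j}). \qquad (3.2)$$

This metric can be computed in a symbol-by-symbol manner, starting with the first symbol $s_1$ and
proceeding to $s_{N_t}$, by exploiting the relationships

$$\begin{aligned}
\mu_1 &= -\frac{1}{\sigma_n^2} \bigl| l_{11}(s_1 - \hat{s}_1) \bigr|^2 + \sum_{j \in J_1} L_A(x_{1,j}) \\
\mu_i &= \mu_{i-1} - \frac{1}{\sigma_n^2} \Bigl| l_{ii}(s_i - \hat{s}_i) + \sum_{j=1}^{i-1} l_{ij}(s_j - \hat{s}_j) \Bigr|^2 + \sum_{j \in J_i} L_A(x_{i,j}) \\
\mu(\mathbf{s}) &= \mu_{N_t}.
\end{aligned} \qquad (3.3)$$

This property, which hinges on the lower triangular structure of $\mathbf{L}$, is used in the ITS scheme to
generate a list of good candidate bit sequences, or candidate list. This is described next.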

A symbol vector $\mathbf{s}$ consists of $N_t$ symbols, each chosen from an alphabet of size $2^{M_c}$. The
set of all possible symbol vectors can therefore be represented by a tree structure of depth $N_t$,
having a single symbol on each branch and $2^{M_c}$ branches out of each node, as illustrated in Fig. 3.1.
Associated with each path within the tree are a sequence of symbols $s_1, \cdots, s_d$ and a metric $\mu_d$,
where $d \le N_t$ indicates the symbol depth of the path. Every possible symbol vector corresponds to
a path to the maximum symbol depth, $N_t$, and has a total metric $\mu(\mathbf{s}) = \mu_{N_t}$.
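
The symbol-by-symbol recursion (3.3) can be checked numerically against the quadratic form in (3.1). The following Python sketch assumes that $\mathbf{L}$ and $\hat{\mathbf{s}}$ have already been computed; the numerical values, the function names and the per-symbol a priori sums `prior[i]` (standing for $\sum_{j \in J_i} L_A(x_{i,j})$) are made up for illustration.

```python
def path_metrics(s, s_hat, L, sigma2, prior):
    """Return [mu_1, ..., mu_Nt] computed with the recursion (3.3)."""
    mus, mu = [], 0.0
    for i in range(len(s)):
        # Row i of the lower triangular L only involves symbols s_1 .. s_i.
        inner = sum(L[i][j] * (s[j] - s_hat[j]) for j in range(i + 1))
        mu += -abs(inner) ** 2 / sigma2 + prior[i]
        mus.append(mu)
    return mus


def direct_metric(s, s_hat, L, sigma2, prior):
    """Evaluate -(1/sigma2) (s - s_hat)^H L^H L (s - s_hat) plus the a priori sum."""
    n = len(s)
    d = [s[j] - s_hat[j] for j in range(n)]
    gram = [[sum(complex(L[k][a]).conjugate() * L[k][b] for k in range(n))
             for b in range(n)] for a in range(n)]
    quad = sum(d[a].conjugate() * gram[a][b] * d[b]
               for a in range(n) for b in range(n))
    return -quad.real / sigma2 + sum(prior)


# Hypothetical example values for N_t = 3 (L is lower triangular,
# with real, positive diagonal entries).
L = [[1.2, 0.0, 0.0],
     [0.3 - 0.4j, 0.9, 0.0],
     [-0.2 + 0.1j, 0.5j, 1.1]]
s_hat = [0.6 - 0.8j, -0.7 + 0.5j, 0.9 + 0.6j]
s = [(1 - 1j) / 2 ** 0.5, (-1 + 1j) / 2 ** 0.5, (1 + 1j) / 2 ** 0.5]
prior = [0.4, 0.0, -1.3]

mus = path_metrics(s, s_hat, L, sigma2=0.5, prior=prior)
total = direct_metric(s, s_hat, L, sigma2=0.5, prior=prior)
```

The last partial metric `mus[-1]` agrees with `total` up to rounding, while the earlier entries are the partial metrics $\mu_d$ attached to the path at depth $d$.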

Figure 3.1. Example of a sequential tree search, for $N_t = 4$, $M_c = 2$. At each symbol depth, the best $M = 4$
paths are retained. Deleted paths are not shown.

The ITS detector uses a breadth-first tree search algorithm known as the M-algorithm [14] to
search for the best paths through the tree. To this end, at each symbol depth smaller than $N_t$, the
algorithm keeps a list of the best $M$ paths found thus far and their metrics, and moves forward
by extending each of these paths to form $M \cdot 2^{M_c}$ new paths. Metrics are then updated according
to (3.3), and the $M(2^{M_c} - 1)$ worst paths are deleted. The heart of the M-algorithm is a sorting
procedure which deletes these paths. An attractive implementation of this procedure is the heapsort
algorithm [17]. Heapsort is particularly suitable if only a partial ordering is desired, as is the case
here. Furthermore, its complexity is guaranteed to be $O(n \log n)$, $n$ being the number of input data,
and is almost independent of the distribution of the input data¹. This is a favourable property with
regard to practical implementation.
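
A compact sketch of this candidate-list generation is given below. It is not the implementation used in the report: it assumes a QPSK mapping, takes $\mathbf{L}$ and $\hat{\mathbf{s}}$ as given, and uses Python's `heapq.nlargest` for the partial selection in place of a hand-written heapsort. All function and variable names are illustrative.

```python
import heapq
import itertools
import math


def qpsk_map(bits):
    """ASSUMED mapping: two +/-1 bits -> one unit-energy QPSK symbol."""
    return complex(bits[0], bits[1]) / math.sqrt(2)


def m_algorithm(s_hat, L, sigma2, LA, M, m_c=2):
    """Breadth-first tree search that keeps the best M paths at each depth.

    Returns a list of (bit_sequence, metric) pairs, best metric first.
    """
    n_t = len(s_hat)
    labels = list(itertools.product((-1, +1), repeat=m_c))
    paths = [(0.0, (), ())]          # (metric, bits so far, symbols so far)
    for i in range(n_t):
        extended = []
        for mu, bits, syms in paths:
            for lab in labels:       # 2^m_c branches out of each node
                new_syms = syms + (qpsk_map(lab),)
                inner = sum(L[i][j] * (new_syms[j] - s_hat[j])
                            for j in range(i + 1))
                prior = sum(LA[i * m_c + k]
                            for k, b in enumerate(lab) if b == +1)
                # Metric update of (3.3) for the newly added symbol.
                extended.append((mu - abs(inner) ** 2 / sigma2 + prior,
                                 bits + lab, new_syms))
        # Keep the M paths with the largest metric, delete the rest.
        paths = heapq.nlargest(M, extended, key=lambda p: p[0])
    return [(bits, mu) for mu, bits, _ in paths]


# Toy example: diagonal L (orthogonal channel columns), s_hat on a constellation point.
L = [[1.0, 0.0], [0.0, 1.0]]
x_true = (+1, -1, -1, +1)
s_hat = [qpsk_map(x_true[0:2]), qpsk_map(x_true[2:4])]
cands = m_algorithm(s_hat, L, sigma2=0.5, LA=[0.0] * 4, M=4)
```

In this toy case the first entry of `cands` is the transmitted bit sequence with metric zero. Choosing $M = 2^{N_t M_c}$ turns the search into an exhaustive one.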

After having generated the candidate list, denoted here by $\mathcal{L}$, the ITS detector computes an
approximation of (2.4) as

$$L_E(x_{n,k}) = \log \frac{\sum_{\mathbf{x} \in \mathcal{L} \cap X_{n,k}^{+1}} \exp \mu(\mathbf{s})}{\sum_{\mathbf{x} \in \mathcal{L} \cap X_{n,k}^{-1}} \exp \mu(\mathbf{s})} - L_A(x_{n,k}). \qquad (3.4)$$

In practice, the log-sum over exponential functions, which is a relatively complex operation, can be
approximated by

$$\log \sum_j \exp \mu_j \approx \max_j \mu_j \qquad (3.5)$$

with little performance degradation. This is referred to as the max-log approximation [18].
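
The quality of the max-log approximation (3.5) is easy to examine numerically. The short Python sketch below (function names and metric values are illustrative) evaluates both sides; the exact value exceeds the maximum by at most the logarithm of the number of terms.

```python
import math


def log_sum_exp(mus):
    """Exact left-hand side of (3.5), evaluated in a numerically stable way."""
    peak = max(mus)
    return peak + math.log(sum(math.exp(mu - peak) for mu in mus))


def max_log(mus):
    """Right-hand side of (3.5): keep only the dominant metric."""
    return max(mus)


mus = [-0.3, -2.1, -7.5]       # example candidate metrics
exact = log_sum_exp(mus)
approx = max_log(mus)
gap = exact - approx           # always between 0 and log(len(mus))
```

The gap shrinks as the best metric separates from the others, which is typically the case at higher SNR.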

Theoretically, the performance of the ITS detector is identical to that of the MAP detector only
for the maximum possible list size, i.e., for $M = 2^{N_t M_c}$. In practice, however, near-optimum
performance can often be achieved when $M$ is only a small fraction of $2^{N_t M_c}$. The efficiency of the
ITS detector in approaching MAP performance depends on the correlation between the columns
of the channel matrix $\mathbf{H}$, which is identical to the correlation between the columns of $\mathbf{L}$ because
$\mathbf{H}^\dagger \mathbf{H} = \mathbf{L}^\dagger \mathbf{L}$. For example, if the columns of $\mathbf{H}$ are orthogonal, so that $\mathbf{L}$ is diagonal, the search
over each symbol is independent of decisions made on previous symbols, and the M-algorithm is
guaranteed to yield the $M$ best candidate symbol vectors. In that extreme case, the performance of
the ITS detector is near-optimum even for very small list sizes. In the other extreme, if $\mathbf{H}$ is fully
correlated, $\mathbf{L}$ degenerates to a matrix whose elements are zero, except those in the bottom row. In
that case, the metric $\mu(\mathbf{s})$ can no longer be computed in a symbol-by-symbol manner, as in (3.3),
and near-optimum performance can only be achieved with very large list sizes, i.e., with a nearly
exhaustive tree search.

¹ Here and in the following, Landau's symbol $O$, which suggests “order”, is used to indicate that, for large $n$, the
function $f(n) = O(h(n))$ is proportional to $h(n)$.

Another consequence of the use of the M-algorithm instead of an exhaustive tree search is that
there may be bit positions for which all bit sequences used in the LLR computation (3.4) have the
same binary value. This possibility becomes increasingly likely for smaller list sizes, and is certain
to occur if $M < N_t M_c$. In such a case, (3.4) cannot be evaluated because either
$\mathcal{L} \cap \mathcal{X}_{n,k}^{+1}$ or $\mathcal{L} \cap \mathcal{X}_{n,k}^{-1}$ is empty. Instead, $L_E(x_{n,k})$ is then assigned a negative or
positive clipping value, $\mp L_{E,\mathrm{clip}}$, respectively. It was found that the choice of the clipping value
can affect the performance of the iterative receiver. Ideally, $L_{E,\mathrm{clip}}$ should be different for each
channel bit as follows:

$$L_{E,\mathrm{clip}} = \log \frac{\Pr\{x_{n,k} L_E(x_{n,k}) > 0\}}{1 - \Pr\{x_{n,k} L_E(x_{n,k}) > 0\}}, \tag{3.6}$$

where $\Pr\{x_{n,k} L_E(x_{n,k}) > 0\}$ represents the probability that the bit sequences in $\mathcal{L}$ contain the true
binary value for $x_{n,k}$, i.e., that the decision to drop all bit sequences with a different binary value
was correct. It can be shown that this clipping value maximizes the mutual information between
$x_{n,k}$ and $L_E(x_{n,k})$, i.e.,

$$I(L_E(x_{n,k}); x_{n,k}) = 1 - E\{\log_2(1 + e^{-x_{n,k} L_E(x_{n,k})})\}. \tag{3.7}$$

In practice, however, the exact value of $\Pr\{x_{n,k} L_E(x_{n,k}) > 0\}$, which depends on many factors
including signal-to-noise ratio (SNR) and channel correlation, is unknown, and a fixed clipping
value is used. A constant clipping value that is much lower than the optimum value for a given
$\Pr\{x_{n,k} L_E(x_{n,k}) > 0\}$ causes the channel decoder to largely ignore the clipped detector output values,
which degrades its error-correction effectiveness. A clipping value that is much higher than its
optimum value, on the other hand, forces the channel decoder to assume that the clipped detector
output values have the correct sign. In the case that this assumption is false, soft decisions on other
bits must be compromised in order to meet the code constraints, leading to error propagation. This
performance degradation for relatively small or large clipping values is illustrated quantitatively in
Fig. 3.2, which shows plots of the mutual information between $x_{n,k}$ and the clipped detector output
$L_E(x_{n,k})$ versus the clipping value $L_{E,\mathrm{clip}}$, for different values of $\Pr\{x_{n,k} L_E(x_{n,k}) > 0\}$. It is
seen that the maximum amount of information contained in the clipped detector output values is
relatively small for clipping values smaller than approximately 3. The mutual information also
decreases for clipping values larger than approximately 2, but this effect is limited to lower values of
$\Pr\{x_{n,k} L_E(x_{n,k}) > 0\}$. A reasonable balance can be found at intermediate clipping values, and good
results were obtained with $L_{E,\mathrm{clip}} = 3$, as can be seen in Sections 3.5 and 4.
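
The trade-off described above can be explored numerically. The following is a minimal sketch, assuming that a clipped detector output takes only the two values $\pm L_{E,\mathrm{clip}}$ and has the correct sign with probability p, so that the expectation in (3.7) reduces to a two-term sum. The function names are illustrative and do not come from the report.

```python
import math

def optimal_clip(p):
    """Clipping value given by (3.6) when the clipped sign is correct with probability p."""
    return math.log(p / (1.0 - p))

def clipped_mutual_info(p, l_clip):
    """Evaluate (3.7) for a two-valued output: x*L_E equals +l_clip with
    probability p and -l_clip with probability 1 - p (an assumed model)."""
    return 1.0 - (p * math.log2(1.0 + math.exp(-l_clip))
                  + (1.0 - p) * math.log2(1.0 + math.exp(l_clip)))

if __name__ == "__main__":
    for p in (0.8, 0.9, 0.95, 0.99):
        l_opt = optimal_clip(p)
        print(f"p={p:.2f}  L_opt={l_opt:.2f}  "
              f"I(L_opt)={clipped_mutual_info(p, l_opt):.3f}  "
              f"I(3)={clipped_mutual_info(p, 3.0):.3f}")
```

Sweeping `l_clip` for a fixed `p` traces out a curve of the kind plotted in Fig. 3.2, under the two-valued assumption stated above.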

The complexity per bit of the ITS detector is dominated by the metric update computation in
(3.3) if $N_t$ is large enough, and is therefore $O(N_t)$. Because the number of metric updates to be
computed at each symbol depth in the tree search is $M \cdot 2^{M_c}$, the complexity per bit depends on the
signal constellation size as $O(2^{M_c}/M_c)$.


3.2  Extension  for  multilevel  bit  mappings 

The complexity per bit of the ITS detector can be made nearly independent of $M_c$ with the aid of a
special constellation mapping that is referred to herein as a multilevel bit mapping. By definition, a
QAM signal constellation with a multilevel bit mapping has the property that it can be partitioned
into four equal subsets such that (a) the maximum Euclidean distance between the signal points
in each subset is minimized, (b) each subset can be uniquely identified by the first two bits of its
signal points, and (c) the remaining $M_c - 2$ bits of each subset again form a multilevel bit mapping.
An example of a set of QAM signal constellations with a multilevel Gray bit mapping is given in
Fig. 3.3, with intersecting dotted lines separating the subsets at each level. This figure also shows
that, if the location of a higher-order, say 64-QAM, signal point s is to be determined but only its
first $2l < M_c$ bits are known, the best estimate is formed by the corresponding "intermediate" signal
point at level $l$, denoted here by $s^{(l)}$. Subsequent pairs of bits refine this estimate, until all $M_c$ bits
are known.

Figure 3.2. Mutual information $I(L_E(x_{n,k}); x_{n,k})$ vs. clipping value $L_{E,\mathrm{clip}}$, for the values of
$\Pr\{x_{n,k} L_E(x_{n,k}) > 0\}$ indicated.

Figure 3.3. Example of a set of QAM signal constellations with a multilevel Gray bit mapping. 64-QAM
("level 3") signal points are represented by black dots. Associated 16-QAM ("level 2") and QPSK ("level 1")
constellations are indicated by gray and white signal points, respectively.

The  use  of  a  multilevel  bit  mapping  makes  it  possible  to  perform  the  tree  search  in  steps  of  two 
bits,  so  that  the  number  of  branches  emanating  from  each  node  is  four,  regardless  of  the  modulation 
order.  This  is  illustrated  in  Fig.  3.4,  which  shows  an  example  of  a  search  over  a  64-QAM  signal 
constellation  at  an  arbitrary  symbol  depth  of  the  tree  search,  for  M  =  4.  At  the  left-hand  side  of  the 
tree  section  shown,  the  path  list  contains  the  best  M  symbol  sequences  found  before  considering 
the  current  symbol.  At  the  tree  position  marked  “level  1”,  each  of  these  paths  is  extended  according 
to  the  four  possible  values  of  the  first  pair  of  bits  in  the  current  symbol,  using  the  QPSK  (4-QAM) 
constellation  in  Fig.  3.3.  The  corresponding  metrics  are  then  calculated,  and  the  best  M  paths 
are  retained.  Likewise,  at  the  position  marked  “level  2”,  all  paths  retained  at  the  previous  level 
are  extended  according  to  the  four  possible  values  of  the  subsequent  bit  pair,  using  the  16-QAM 
constellation  in  Fig.  3.3.  Again,  the  corresponding  metrics  are  calculated,  and  all  but  the  best  M 
paths  are  eliminated.  Thus,  the  search  over  the  current  symbol  constellation  continues  until  the 
maximum  level  l  =  Mc/2  is  reached.  This  modified  ITS  scheme  will  be  referred  to  in  the  remainder 
of  the  report  as  multilevel  mapping  ITS  (MLM-ITS). 
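
The partitioning into nested quadrants can be made concrete with a short sketch. It builds the intermediate signal points $s^{(l)}$ for a square QAM constellation by letting each successive bit pair select one quadrant of the subset chosen so far. The pair-to-quadrant labelling in `QUADRANT` is an illustrative assumption, and is not the Gray labelling of Fig. 3.3. The unnormalized odd-integer grid is also an assumption.

```python
from itertools import product

# One bit pair selects one of four quadrants (a QPSK-like offset).
# This labelling is hypothetical; the report's Fig. 3.3 uses a Gray mapping.
QUADRANT = {(0, 0): 1 + 1j, (0, 1): -1 + 1j, (1, 1): -1 - 1j, (1, 0): 1 - 1j}

def intermediate_point(bits, num_levels):
    """Return s^(l) for the first l = len(bits) // 2 bit pairs of a
    2**(2 * num_levels)-QAM symbol, i.e. the centre of the subset selected so far."""
    point = 0j
    for level in range(len(bits) // 2):
        pair = (bits[2 * level], bits[2 * level + 1])
        # Level 1 moves by the largest step; each later level halves the step.
        point += 2 ** (num_levels - 1 - level) * QUADRANT[pair]
    return point

if __name__ == "__main__":
    bits = (1, 0, 0, 1, 1, 1)  # six bits of one 64-QAM symbol
    for l in (1, 2, 3):
        print(f"level {l}: {intermediate_point(bits[:2 * l], 3)}")
```

Because every node has exactly four children, a tree search over these levels extends each retained path by four branches per level, regardless of the modulation order.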

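Denoting the metric corresponding to level $l$ of symbol depth $d$ by $\mu_d^{(l)}$, the metric update
relations for the MLM-ITS scheme can be written as

$$
\begin{aligned}
\mu_1^{(1)} &= -\frac{1}{\sigma_n^2}\left| l_{11}\big(s_1^{(1)} - \hat{s}_1\big) \right|^2 + \sum_{k \in J_1^+} L_A(x_{1,k}), \\
\mu_d^{(l)} &= \mu_d^{(l-1)} + \frac{1}{\sigma_n^2}\left| l_{dd}\big(s_d^{(l-1)} - \hat{s}_d\big) + \sum_{j=1}^{d-1} l_{dj}(s_j - \hat{s}_j) \right|^2 \\
&\quad - \frac{1}{\sigma_n^2}\left| l_{dd}\big(s_d^{(l)} - \hat{s}_d\big) + \sum_{j=1}^{d-1} l_{dj}(s_j - \hat{s}_j) \right|^2 + \sum_{k \in J_l^+} L_A(x_{d,k}), \qquad l = 2, \dots, M_c/2, \\
\mu_d^{(1)} &= \mu_{d-1}^{(M_c/2)} - \frac{1}{\sigma_n^2}\left| l_{dd}\big(s_d^{(1)} - \hat{s}_d\big) + \sum_{j=1}^{d-1} l_{dj}(s_j - \hat{s}_j) \right|^2 + \sum_{k \in J_1^+} L_A(x_{d,k}), \qquad d = 2, \dots, N_t, \\
\mu(\mathbf{s}) &= \mu_{N_t}^{(M_c/2)},
\end{aligned}
\tag{3.8}
$$

in which

$$J_l^+ = \{ k \mid k \in \{2l - 1, 2l\} \text{ and } x_{d,k} = +1 \}. \tag{3.9}$$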

The complexity per bit of MLM-ITS is still $O(N_t)$, but is nearly independent of the constellation
size,  which  is  a  major  improvement  over  the  basic  ITS  scheme.  As  will  be  shown  in  Section  3.5, 
this  complexity  reduction  does  not  affect  the  performance  of  the  detector. 

Figure 3.4. Example of a sequential tree search over a 64-QAM constellation with a multilevel bit mapping,
shown between depth d-1 and depth d. At each level, the best M = 4 paths are retained. Deleted paths are
not shown.

3.3  Comparison  to  BLAST 

Some similarity exists between ITS detection and the non-iterative BLAST nulling/cancelling
detection algorithm described in [19, 20]. In both schemes, the channel matrix is effectively transformed
to a triangular form, such that the detection of any given symbol, say $s_n$, is only affected by the
interference from previous symbols $s_1, \dots, s_{n-1}$ (nulling). However, whereas the BLAST scheme
immediately makes a hard decision on $s_n$ and subtracts the detected symbol from the received signal
vector (cancelling), the ITS detector retains and cancels several possible realizations of $s_n$. A
hard decision is possibly forced at a later stage of the tree search, when, due to the limited list size,
all candidate symbol vectors with a different realization of $s_n$ are dropped from the candidate list.
Because it is able to defer hard decisions until more symbols have been processed, the ITS detector
effectively detects multiple symbols jointly, and is therefore less prone to error propagation than
BLAST. Also, its performance is considerably less sensitive to the order in which the symbols are
detected: simulations have shown that the gain in ITS performance due to optimal detection ordering,
proposed in [20], would be 0.5 dB or less for the scenarios considered in the following sections.
For a list size of one, the two schemes are basically identical, although BLAST does not accept soft
input information and requires the channel code blocks to be organized in a specific (diagonally or
vertically layered) manner, whereas ITS detection permits arbitrary code block organization.

3.4  Computational  complexity 

The  complexities  of  the  basic  ITS  and  MLM-ITS  detection  schemes  were  analyzed  by  measuring  the 
execution times of their algorithm components. Since they constitute the bulk of the total complexity,
only the components that must be executed for every symbol were considered. These include the
metric  update  defined  by  (3.3)  and  (3.8),  respectively,  as  well  as  the  heapsort  sorting  routine  and  the 
LLR  computation  (3.4),  approximated  by  the  max-log  approximation.  The  complexity  associated 
with  the  computation  of  the  ML  symbol  estimate  is  negligible,  and  was  therefore  not  considered. 

In  the  complexity  analysis,  identical  assumptions  were  made  with  regard  to  the  implementation 
of  both  detection  schemes.  Care  was  taken  to  code  all  algorithm  components  efficiently,  and  the 
resulting C source code was compiler-optimized for and run on the same Intel Pentium III-based
platform.  The  MIMO  configurations  considered  were  4x4  and  8x8,  the  modulation  formats 
were  QPSK  (4-QAM),  16-QAM  and  64-QAM,  and  the  ITS  list  size,  M,  was  varied  from  8  to  64. 
All  measured  complexities  were  normalized  to  the  total  complexity  of  the  basic  ITS  detector  for 
4x4  QPSK  with  M  =  8.  Note  that,  for  both  schemes,  the  complexity  per  bit  is  proportional  to  the 
number  of  iterations  in  the  detector/decoder  loop. 

Table 3.1 shows the normalized measured complexity per bit of basic ITS and MLM-ITS. It is
seen that, except for QPSK modulation, the complexity of MLM-ITS is significantly lower than that
of basic ITS. For example, for 8 x 8 64-QAM, MLM-ITS is approximately six times faster than
basic ITS. This difference is due mainly to the complexity reduction of the metric update, although
the complexity of the heapsort routine is also lower, because the number of paths to be sorted in the
tree search is reduced from $M \cdot 2^{M_c}/M_c$ to $2M$ per bit. For both ITS schemes, the complexity of
the metric update is proportional to M, while the heapsort complexity grows somewhat faster than
linearly with M, as expected from the discussion in Section 3.1. The complexity per bit of the metric
update increases linearly with $N_t$, as expected, but this dependency is only weak. This is especially
true for basic ITS with large signal constellations, where the metric update complexity is dominated
by operations that are independent of $N_t$. Note that the ITS complexity does not depend on $N_r$. For
each of the two ITS schemes, the overall complexity per bit is approximately proportional to M, and
increases only slightly with increasing $N_t$. The total complexity per bit of basic ITS is proportional
to the signal constellation size, whereas that of MLM-ITS is nearly independent of $M_c$.
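
The path counts quoted above follow from simple arithmetic, sketched below. The function names are illustrative. The counts are per retained list of size M and per bit, and they ignore all constant factors of an actual implementation.

```python
def paths_per_bit_basic(list_size, bits_per_symbol):
    """Basic ITS: each of the M retained paths is extended by all 2**Mc
    constellation points, and one symbol carries Mc bits."""
    return list_size * 2 ** bits_per_symbol / bits_per_symbol

def paths_per_bit_mlm(list_size):
    """MLM-ITS: each of the M retained paths is extended by 4 branches at each
    of the Mc/2 levels, i.e. 4 * M * (Mc / 2) per symbol, or 2 * M per bit."""
    return 2 * list_size

if __name__ == "__main__":
    for name, bits_per_symbol in (("QPSK", 2), ("16-QAM", 4), ("64-QAM", 6)):
        basic = paths_per_bit_basic(8, bits_per_symbol)
        mlm = paths_per_bit_mlm(8)
        print(f"{name}: basic={basic:.1f}  MLM={mlm}  ratio={basic / mlm:.2f}")
```

For QPSK the two counts coincide, which is in line with the identical QPSK entries in Table 3.1.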

It has been shown in [2] that the complexity reduction of the modified ITS scheme (MLM-ITS)
relative to the basic ITS scheme does not affect performance, i.e., the performance difference
between basic ITS and MLM-ITS is negligible. This implies that MLM-ITS should be used instead
of  basic  ITS  wherever  applicable.  For  this  reason,  the  performance  results  shown  in  the  remainder 
of  this  report  were  all  generated  using  the  MLM-ITS  detector. 

3.5  Performance  under  ideal  channel  conditions 

This  section  presents  error  performance  results  obtained  from  computer  simulations  of  an  iterative 
MIMO  receiver  employing  the  MLM-ITS  detector,  under  ideal  channel  conditions,  i.e.,  perfect 
channel  estimation,  and  rapid  and  spatially  uncorrelated  fading.  The  effects  of  non-perfect  channel 
estimation,  fading  correlation  and  slow  fading  are  investigated  in  the  next  chapter. 

The simulation results presented in the remainder of this report were generated for a 4 x 4 MIMO
configuration with QPSK, 16-QAM and 64-QAM modulation, with the multilevel Gray mapping
shown in Fig. 3.3. The channel code is a turbo code of rate 1/2 and memory 2, with feedforward
and feedback generators 5 and 7 (octal), respectively. Frames of 9216 information bits are fed to
the channel encoder, and subsequently transmitted over a block fading channel, represented by the
propagation matrix H. In this section, the elements of H are samples of independent, complex-valued,
zero-mean Gaussian processes, and thus model a rich scattering (Rayleigh) MIMO channel.
The channel remains constant over $N_t M_c/2$ information bits, or $N_t M_c$ code bits, and changes to a
new, statistically independent realization between these blocks. All interleavers are pseudorandom,
and, as theory predicts that bad interleavers are rare if the frame size is large, no attempt was made
to optimize their design. The number of iterations in the turbo decoder is eight, and that in the
detector/decoder loop is four, unless noted otherwise. Except for some of the results in Fig. 3.7, the
detector output is computed using the max-log approximation (3.5); no approximations are made
in the turbo channel decoder. The clipping value $L_{E,\mathrm{clip}}$ is equal to 3, as discussed in Section 3.1.
The list size M is varied between 8 and 64. The average SNR at each receive antenna is denoted by
$E_s/N_0 = N_t \sigma_s^2 / \sigma_n^2$, where $\sigma_s^2 = E\{|s_n|^2\}$ is the average power used on each transmit antenna.

Table 3.1. Normalized measured complexity per bit of basic ITS and MLM-ITS detection. Entries contain
basic ITS and MLM-ITS complexities, respectively, separated by slashes.

Configuration            Metric update   Heapsort      LLR comp.    Total
4 x 4 QPSK,   M = 8      0.6 / 0.6       0.3 / 0.3     0.1 / 0.1    1.0 / 1.0
4 x 4 QPSK,   M = 16     1.2 / 1.2       0.7 / 0.7     0.2 / 0.2    2.1 / 2.1
4 x 4 QPSK,   M = 32     2.4 / 2.4       1.6 / 1.6     0.3 / 0.3    4.3 / 4.3
4 x 4 QPSK,   M = 64     4.8 / 4.8       3.4 / 3.4     0.5 / 0.5    8.7 / 8.7
4 x 4 16-QAM, M = 8      1.5 / 0.6       0.5 / 0.3     0.1 / 0.1    2.1 / 1.1
4 x 4 16-QAM, M = 16     3.1 / 1.3       1.0 / 0.7     0.2 / 0.2    4.3 / 2.1
4 x 4 16-QAM, M = 32     6.2 / 2.6       2.1 / 1.6     0.3 / 0.3    8.6 / 4.5
4 x 4 16-QAM, M = 64     12.5 / 5.2      4.2 / 3.4     0.5 / 0.5    17.3 / 9.0
4 x 4 64-QAM, M = 8      6.3 / 0.7       1.1 / 0.3     0.1 / 0.1    7.6 / 1.1
4 x 4 64-QAM, M = 16     12.7 / 1.3      2.3 / 0.7     0.2 / 0.2    15.1 / 2.2
4 x 4 64-QAM, M = 32     25.4 / 2.6      4.9 / 1.6     0.3 / 0.3    30.6 / 4.5
4 x 4 64-QAM, M = 64     50.9 / 5.2      10.1 / 3.4    0.5 / 0.5    61.5 / 9.1
8 x 8 QPSK,   M = 8      0.7 / 0.7       0.3 / 0.3     0.1 / 0.1    1.1 / 1.1
8 x 8 QPSK,   M = 16     1.5 / 1.5       0.7 / 0.7     0.2 / 0.2    2.3 / 2.3
8 x 8 QPSK,   M = 32     2.9 / 2.9       1.6 / 1.6     0.3 / 0.3    4.8 / 4.8
8 x 8 QPSK,   M = 64     5.8 / 5.8       3.4 / 3.4     0.5 / 0.5    9.8 / 9.8
8 x 8 16-QAM, M = 8      1.6 / 0.8       0.5 / 0.3     0.1 / 0.1    2.2 / 1.2
8 x 8 16-QAM, M = 16     3.2 / 1.5       1.0 / 0.7     0.2 / 0.2    4.4 / 2.4
8 x 8 16-QAM, M = 32     6.4 / 3.1       2.1 / 1.6     0.3 / 0.3    8.8 / 5.0
8 x 8 16-QAM, M = 64     12.9 / 6.1      4.2 / 3.4     0.5 / 0.5    17.7 / 10.1
8 x 8 64-QAM, M = 8      6.4 / 0.8       1.1 / 0.3     0.1 / 0.1    7.6 / 1.2
8 x 8 64-QAM, M = 16     12.8 / 1.5      2.3 / 0.7     0.2 / 0.2    15.3 / 2.4
8 x 8 64-QAM, M = 32     25.4 / 3.1      4.9 / 1.6     0.3 / 0.3    30.7 / 5.0
8 x 8 64-QAM, M = 64     51.0 / 6.1      10.1 / 3.4    0.5 / 0.5    61.8 / 10.1

Sum of partial complexities may not agree with total complexity due to independent rounding.

The  simulation  results  of  Fig.  3.5(a)  and  (b)  show  the  dependence  of  the  MLM-ITS  bit  error 
performance  on  the  number  of  iterations  in  the  detector/decoder  loop,  for  list  sizes  of  M  =  8  and  64, 
respectively.  It  can  be  seen  that  iterative  MIMO  systems  have  considerably  improved  performance 
relative  to  non-iterative  systems,  i.e.,  systems  for  which  the  number  of  iterations  is  one.  Most  of  the 
gain  achieved  by  iterating  the  receiver  is  obtained  in  the  second  iteration,  and  increasing  the  number 
of  iterations  further  results  in  rapidly  diminishing  gains.  For  example,  the  performance  gain  that  is 
achieved  by  increasing  the  number  of  iterations  from  3  to  6  is  smaller  than  1  dB  for  all  modulation 
formats  considered  in  Fig.  3.5.  This  observation  is  valid  both  for  M  =  8  and  M  =  64,  and  is  also 
in  agreement  with  results  reported  in  [5]  and  [4, 6],  discussed  in  Section  2.2,  even  though  the  latter 
were  obtained  with  different  detection  and  channel  coding  schemes  than  the  ones  considered  here. 
This  suggests  that  the  convergence  behaviour  is  not  very  highly  dependent  on  the  type  of  detector 
or,  in  the  case  of  MLM-ITS  detection,  its  list  size.  Instead,  “external”  factors  such  as  interleaver 
depth  appear  to  be  more  important  [12].  The  complexity  of  iterative  receivers  is  proportional  to 
the  number  of  iterations.  Based  on  the  results  of  Fig.  3.5,  it  can  be  concluded  that  a  good  balance 
between  performance  and  complexity  can  be  achieved  with  3-4  iterations. 

The bit error performance of MLM-ITS detection as a function of the list size M is shown in
Fig. 3.6. As expected, performance improves if the list size M is increased, in this case from 8
to 64. However, the improvement becomes smaller as the ratio between M and the maximum list
size, $2^{N_t M_c}$, increases. For example, the performance improvement due to increasing M from 8 to
64 is approximately 2 dB for 4 x 4 64-QAM ($2^{N_t M_c} \approx 2 \times 10^7$), but negligible for 4 x 4 QPSK
($2^{N_t M_c} = 256$).

The  potential  performance  degradation  due  to  the  max-log  approximation  (3.5)  in  the  compu¬ 
tation  of  the  detector  output  (3.4)  was  also  investigated.  Fig.  3.7  shows  the  bit  error  performance 
obtained  using  the  true  sum  of  exponentials  and  the  max-log  approximation.  Based  on  the  very 
small  performance  difference  observed  in  Fig.  3.7,  it  can  be  concluded  that  the  max-log  approx¬ 
imation  should  be  used  where  applicable,  because  it  eliminates  most  of  the  complexity  involved 
in  evaluating  the  detector  output,  at  almost  negligible  performance  loss.  The  fact  that  the  detec¬ 
tor  using  the  true  sum  of  exponentials  performs  slightly  worse  than  the  detector  using  the  max-log 
approximation  may  be  a  result  of  numerical  instabilities,  which  are  known  to  be  a  problem  in  the 
evaluation  of  (3.4). 
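
The two ways of evaluating the detector output can be compared with a small sketch. It assumes that (3.4) has the usual form of a difference between two logarithms of sums of exponentiated path metrics, one over the candidates with bit value +1 and one over those with bit value -1; the exact expression is not reproduced in this section, so this form is an assumption. The log-sum-exp rewriting used here is a standard way to avoid underflow and is not claimed to be the method used for Fig. 3.7.

```python
import math

def log_sum_exp(metrics):
    """Numerically stable log(sum(exp(m))) obtained by factoring out the maximum."""
    largest = max(metrics)
    return largest + math.log(sum(math.exp(m - largest) for m in metrics))

def llr_exact(metrics_plus, metrics_minus):
    """LLR using the true sum of exponentials over both candidate subsets."""
    return log_sum_exp(metrics_plus) - log_sum_exp(metrics_minus)

def llr_max_log(metrics_plus, metrics_minus):
    """Max-log approximation: keep only the best metric in each subset."""
    return max(metrics_plus) - max(metrics_minus)

if __name__ == "__main__":
    plus = [-1000.0, -1001.0]   # hypothetical path metrics, bit = +1
    minus = [-1003.0]           # hypothetical path metrics, bit = -1
    print("exact  :", llr_exact(plus, minus))
    print("max-log:", llr_max_log(plus, minus))
```

With metrics of this magnitude, a direct evaluation of `math.exp(m)` underflows to zero, which illustrates one source of the numerical difficulties mentioned above.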

Figure 3.5. Error performance of a 4 x 4 ST-BICM MIMO system employing the MLM-ITS detector, for different
numbers of iterations in the detector/decoder loop, and for (a) M = 8, and (b) M = 64. Channel code is a turbo
code of rate 1/2 and memory 2.

Figure 3.6. Error performance of a 4 x 4 ST-BICM MIMO system employing the MLM-ITS detector, for different
values of the list size M. Number of iterations in the detector/decoder loop is four. Channel code is a turbo
code with rate 1/2 and memory 2.

Figure 3.7. Error performance of a 4 x 4 ST-BICM MIMO system employing the MLM-ITS detector, with and
without using the max-log approximation in the computation of the detector output. Number of iterations in the
detector/decoder loop is four. Channel code is a turbo code with rate 1/2 and memory 2.

4  Issues regarding implementation in real-world MIMO systems


4.1  Channel  estimation 

So far in this report, it has been assumed that the channel matrix H and the noise variance $\sigma_n^2$ are
known perfectly at the receiver. In practice, such channel state information is not available, and H
and $\sigma_n^2$ must be estimated from the received signal vector y, based on knowledge of the transmitted
symbol  vector  s.  The  most  straightforward  channel  estimation  method  is  channel  training,  which 
involves  the  transmission  of  known,  or  pilot  symbol  vectors  for  a  certain  fraction  of  the  frame 
duration,  called  the  training  overhead  factor.  Because  the  pilot  symbol  vectors  themselves  do  not 
convey  any  information,  channel  training  reduces  the  overall  data  throughput.  Another  channel 
estimation  method,  which  does  not  affect  throughput,  is  to  exploit  information  about  the  transmitted 
signal  vector  produced  by  the  detector  or  the  channel  decoder.  This  method  is  commonly  referred 
to  as  decision  feedback,  or  soft  decision  feedback  if  soft  decisions  are  involved,  which  is  usually 
the  case  in  iterative  receivers.  The  optimum  MIMO  channel  estimation  scheme,  in  the  sense  that  it 
maximizes  the  likelihood  of  the  received  signal  vector  y  given  the  transmitted  symbol  vector  s,  is 
defined  by 

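$$\hat{\mathbf{H}} = \hat{\mathbf{R}}_{ys} \hat{\mathbf{R}}_s^{-1}, \tag{4.1}$$

where $\hat{\mathbf{R}}_{ys}$ and $\hat{\mathbf{R}}_s$ denote estimates of the correlation matrices $\mathbf{R}_{ys} = E\{\mathbf{y}\mathbf{s}^\dagger\}$ and $\mathbf{R}_s = E\{\mathbf{s}\mathbf{s}^\dagger\}$.
The estimation of $\mathbf{R}_{ys}$ and $\mathbf{R}_s$, and hence H, by means of finite-length channel training and soft
decision feedback, respectively, is discussed in the next sections. The estimation of $\sigma_n^2$ is addressed
in Section 4.1.3.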


4.1.1  Channel  training 

If a finite number, $N_p$, of pilot symbol vectors, denoted by $\mathbf{s}_{p,i}$, $i = 1, \dots, N_p$, is available for
channel estimation, $\mathbf{R}_{ys}$ and $\mathbf{R}_s$ are estimated as follows:

$$\hat{\mathbf{R}}_{ys} = \frac{1}{N_p} \sum_{i=1}^{N_p} \mathbf{y}_i \mathbf{s}_{p,i}^\dagger, \qquad \hat{\mathbf{R}}_s = \frac{1}{N_p} \sum_{i=1}^{N_p} \mathbf{s}_{p,i} \mathbf{s}_{p,i}^\dagger, \tag{4.2}$$

where $\mathbf{y}_i$ is the signal vector received for the $i$th pilot symbol vector.
It is noted that $N_p$ must be at least equal to the number of transmit antennas in order for the matrix
inverse in (4.1) to exist. It has been shown in [21] that, to minimize the mean squared channel
estimation error, the training symbol vectors must be orthogonal and of equal power, so that $\hat{\mathbf{R}}_s$
is diagonal with equal diagonal elements. It can be shown that the channel estimation accuracy,
expressed in terms of the mean SNR per element of $\hat{\mathbf{H}}$, can then be written as

$$\frac{E\{h_{ij} h_{ij}^*\}}{E\{(\hat{h}_{ij} - h_{ij})(\hat{h}_{ij} - h_{ij})^*\}} = \frac{N_p}{N_t} \frac{E_s}{N_0}, \tag{4.3}$$

where $h_{ij}$ and $\hat{h}_{ij}$ denote the $(i, j)$th element of $\mathbf{H}$ and $\hat{\mathbf{H}}$, respectively, and $E_s/N_0$ is the SNR at
each receive antenna, as previously. It follows from Eq. (4.3) that, if $N_p = N_t$, the mean SNR of
the elements of $\hat{\mathbf{H}}$ is equal to $E_s/N_0$. If better channel estimation accuracy is required, the number
of pilot symbol vectors must be increased. Alternatively, the transmit power in the training interval
could be increased. However, the benefit of this option is limited, because the payload power would
then have to be reduced in order to keep the overall transmit power constant. An information-theoretic
analysis in [21] has shown that, under typical conditions, the capacity gain resulting from
the optimization of training and payload powers is only 5-10%. Also, this approach causes some
difficulties concerning the implementation of the transmitter front end, most notably a loss in power
efficiency due to the increased peak-to-average power ratio.

Figure 4.1. Error performance of a 4 x 4 ST-BICM MIMO system employing the MLM-ITS detector with M = 8,
for different channel estimation accuracies. Number of iterations in the detector/decoder loop is four. Channel
code is a turbo code with rate 1/2 and memory 2.

In order to evaluate the sensitivity of the MLM-ITS detector to channel estimation errors, computer
simulations were performed in which the detector was provided with a copy of H whose
elements were corrupted by additive complex Gaussian noise. The average SNR of the matrix
elements is used as a measure for the channel estimation accuracy. Simulation results for a 4 x 4
MIMO configuration with M = 8 are shown in Fig. 4.1. All other parameters were kept the same as
in Section 3.5. It is seen that performance degradation is almost negligible (< 1 dB) if the channel
estimation SNR is at least 10 dB higher than the $E_s/N_0$ required for "error-free" transmission (BER
$< 10^{-4}$) with perfect channel estimation. For example, for 4 x 4 64-QAM and perfect channel
knowledge, error-free performance is achieved at values of $E_s/N_0$ higher than approximately 17
dB. Performance degradation is approximately 1 dB for a channel estimation SNR of 25 dB, which
is 8 dB above $E_s/N_0$, and becomes smaller if the channel estimation accuracy is improved further.
For channel estimation SNRs lower than approximately 10 dB above $E_s/N_0$, performance degradation
increases rapidly. Similar observations can be made for the other curves in Fig. 4.1. From these
results it can be concluded that, if the channel estimation scheme relies on training only, the number
of pilot symbol vectors must be at least ten times higher than the number of transmit antennas for
negligible performance degradation, i.e., $N_p/N_t \geq 10$.
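
The training-based estimate (4.1) and its dependence on the number of pilots can be sketched with NumPy as follows. This is a minimal illustration, not the simulation code behind Fig. 4.1: the DFT-based pilot design, the unit pilot power, the noise level and all function names are assumptions made for the example.

```python
import numpy as np

def estimate_channel(received, pilots):
    """ML channel estimate H_hat = R_ys * inv(R_s) from Np pilot vectors.
    received has shape (Nr, Np), pilots has shape (Nt, Np)."""
    n_pilots = pilots.shape[1]
    r_ys = received @ pilots.conj().T / n_pilots
    r_s = pilots @ pilots.conj().T / n_pilots
    return r_ys @ np.linalg.inv(r_s)

def dft_pilots(n_t, n_p):
    """Unit-power pilots taken from the first n_t rows of an n_p-point DFT matrix,
    chosen here so that R_s is the identity (requires n_p >= n_t)."""
    rows = np.arange(n_t)[:, None]
    cols = np.arange(n_p)[None, :]
    return np.exp(-2j * np.pi * rows * cols / n_p)

def estimation_snr_db(h, h_hat):
    """Ratio of mean channel power to mean squared estimation error, in dB."""
    return 10 * np.log10(np.mean(np.abs(h) ** 2) / np.mean(np.abs(h_hat - h) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_t = n_r = 4
    noise_var = 0.4  # with unit pilot power, Es/N0 = n_t / noise_var = 10 (10 dB)
    h = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
    for n_p in (4, 40):
        pilots = dft_pilots(n_t, n_p)
        noise = np.sqrt(noise_var / 2) * (rng.standard_normal((n_r, n_p))
                                          + 1j * rng.standard_normal((n_r, n_p)))
        h_hat = estimate_channel(h @ pilots + noise, pilots)
        print(f"Np={n_p:3d}  estimation SNR = {estimation_snr_db(h, h_hat):.1f} dB")
```

Since a single 4 x 4 channel realization is used, the printed values fluctuate around the prediction of (4.3); averaging over many realizations would be needed for a meaningful comparison.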

In order to conserve throughput, it is generally desirable to limit the training overhead factor to
approximately 10% or less, which can be achieved only if the training period is at least an order of
magnitude shorter than the channel coherence time, which is denoted by $T_c$ and will be discussed in
more detail in Section 4.3. It thus follows that the symbol vector transmission rate must be at least
two orders of magnitude higher than the ratio $N_t/T_c$, i.e., the coded bit rate must be greater than
approximately $100 N_t^2 M_c/T_c$. The coherence time depends strongly on the propagation environment,
and can vary from approximately 1 ms for high-speed mobile applications to approximately 100 ms
for indoor applications. For example, a 4 x 4 64-QAM MIMO system designed to operate in an
indoor environment with $T_c$ = 100 ms must have a coded bit rate higher than approximately 100
kbps in order for performance loss due to imperfect channel estimation to be negligible. If lower
data rates are desired, soft-decision feedback can be employed to enhance channel estimation while
preserving information throughput, as discussed in the next section.
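
The bit-rate bound above is a chain of three factors, which the following sketch spells out. The function and parameter names are illustrative; the default values of ten pilot vectors per transmit antenna and a 10% overhead limit are the figures quoted in the text.

```python
def min_coded_bit_rate(n_t, bits_per_symbol, coherence_time_s,
                       pilots_per_antenna=10, max_overhead=0.1):
    """Lower bound on the coded bit rate (bits/s) for training-only estimation."""
    n_pilots = pilots_per_antenna * n_t               # Np >= 10 * Nt pilot vectors
    vectors_per_block = n_pilots / max_overhead       # pilots are at most 10% of a block
    vector_rate = vectors_per_block / coherence_time_s  # one block per coherence time
    return vector_rate * n_t * bits_per_symbol        # Nt * Mc code bits per vector

if __name__ == "__main__":
    # 4 x 4 64-QAM (Mc = 6) with Tc = 100 ms, as in the indoor example above
    print(min_coded_bit_rate(4, 6, 0.1), "bit/s")
```

For the indoor example this evaluates to 100 x 16 x 6 / 0.1 s = 96 kbit/s, consistent with the figure of approximately 100 kbps stated above.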

4.1.2  Soft  decision  feedback 

The training overhead required for accurate channel estimation can be reduced by exploiting the soft
decisions on the transmitted data produced by the detector or the channel decoder. Such soft decision
feedback schemes have, for example, been proposed in [6, 7, 22]. Typically, in these schemes an
initial channel estimate is produced with the aid of a relatively small number of pilot symbol vectors.
In addition, LLR values pertaining to the code bits transmitted during the payload interval are used to
generate soft estimates, $\hat{\mathbf{s}}$, of the data symbol vectors. These soft estimates are then used to improve
the estimates of $\mathbf{R}_{ys}$ and $\mathbf{R}_s$ required to compute $\hat{\mathbf{H}}$. In iterative receivers, the channel estimation
accuracy tends to improve in later iterations, as the soft decisions produced by the detector and the
decoder become more reliable. It was reported in [6] that, under certain conditions including the
use of QPSK modulation and convolutional coding, such an iterative channel estimation scheme can
achieve almost the same error performance as if perfect channel knowledge were available, although
more detector/decoder iterations are required. The question of whether this conclusion is also valid
for MIMO systems with high-order QAM modulation, different coding schemes, arbitrary numbers
of antennas, etc., is a topic for further research. Another question that is still open is whether it is
better to use soft feedback from the detector or from the channel decoder. While the soft information
produced by the detector tends to be less reliable than that from the decoder, it becomes available
earlier in the iterative process, and can be used to improve the channel estimate, and hence detection
reliability, in the same iteration as it was generated. The latter factor is expected to outweigh the
former.

A  shortcoming  of  the  soft  decision  feedback  channel  estimation  technique  outlined  above  is 
that it does not perform well if no reliable soft decision feedback is available, so that $\hat{\mathbf{s}}$ is a poor
estimate  of  the  true  symbol  vector.  In  that  case,  taking  into  account  soft  decision  feedback  actually 
degrades  the  channel  estimate  obtained  with  channel  training  only.  This  problem  can  be  alleviated 
by  only  taking  into  account  the  soft  estimates  of  symbol  vectors  whose  LLR  values  all  exceed  a 
certain  magnitude  threshold,  as  proposed  in  [6].  However,  because  this  method  typically  ignores 
many  LLR  values  with  moderate  to  high  magnitudes,  it  does  not  exploit  the  full  potential  of  soft 
decision  feedback. 

A possibly more efficient approach to utilizing soft information for channel estimation, which
is quite natural to the MLM-ITS detection scheme, is to update the estimates of the correlation
matrices $R_{ys}$ and $R_s$, and hence of H, with the aid of a list of symbol vectors and their estimated
likelihoods of being the true symbol vector. As discussed in Section 3.1, a list of bit sequences
likely to have been transmitted, referred to as the candidate list and denoted by $\mathcal{L}$, is automatically
generated by the MLM-ITS detector in every iteration. It is straightforward to compute an estimate
of the likelihood of a candidate symbol vector from the list of metrics associated with $\mathcal{L}$. Denoting
the likelihood of the symbol vector s, and hence the corresponding bit sequence x, by p(s), $R_{ys}$ and
$R_s$ can be updated as follows:

$$
\begin{aligned}
R_{ys} &\leftarrow \frac{N_s}{N_s + 1} R_{ys} + \frac{1}{N_s + 1}\, y \sum_{x \in \mathcal{L}} p(s)\, s^{\dagger}, \\
R_s &\leftarrow \frac{N_s}{N_s + 1} R_s + \frac{1}{N_s + 1} \sum_{x \in \mathcal{L}} p(s)\, s s^{\dagger}, \\
N_s &\leftarrow N_s + 1,
\end{aligned}
\qquad (4.4)
$$


where $N_s$ is the number of symbol periods over which $R_{ys}$ and $R_s$ are computed, including the pilot
symbol  intervals.  In  contrast  to  the  iterative  channel  estimation  schemes  proposed  in  [6, 7, 22],  soft 
decision  feedback  based  on  the  correlation  matrix  update  equations  (4.4)  does  not  degrade  the  initial 
channel  estimate  even  if  no  reliable  soft  information  is  available,  because  the  likelihoods  p(s)  are 
small  in  that  case.  A  more  detailed  investigation  of  this  approach,  including  an  evaluation  of  its 
performance  and  computational  complexity,  is  a  topic  for  future  research. 
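
The list-based update can be sketched in NumPy as follows. This is a minimal illustration only, assuming that the update is a running average over symbol periods and that the ML estimate (4.1) has the form $H = R_{ys} R_s^{-1}$. The function names `update_correlations` and `channel_estimate` are hypothetical, and the likelihoods p(s) are taken as given.

```python
import numpy as np

def update_correlations(R_ys, R_s, N_s, y, candidates, likelihoods):
    """One soft-feedback update of the running averages R_ys and R_s.

    y           : received vector, shape (N_r,)
    candidates  : candidate symbol vectors from the list, shape (K, N_t)
    likelihoods : estimated likelihoods p(s) of the candidates, shape (K,)
    """
    S = np.asarray(candidates, dtype=complex)
    p = np.asarray(likelihoods, dtype=float)
    s_bar = p @ S                      # sum over the list of p(s) s
    ss_bar = (S.T * p) @ S.conj()      # sum over the list of p(s) s s^H
    R_ys = (N_s * R_ys + np.outer(y, s_bar.conj())) / (N_s + 1)
    R_s = (N_s * R_s + ss_bar) / (N_s + 1)
    return R_ys, R_s, N_s + 1

def channel_estimate(R_ys, R_s):
    # Assumed form of the ML estimate: H = R_ys R_s^{-1}, computed with a
    # linear solve on the conjugate-transposed system instead of an inverse.
    return np.linalg.solve(R_s.conj().T, R_ys.conj().T).conj().T
```

In this sketch, a symbol period whose candidates all have zero likelihood scales $R_{ys}$ and $R_s$ by the same factor, so the resulting channel estimate is unchanged.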

Other  issues  that  need  to  be  addressed  include  the  complexity  associated  with  the  direct  matrix 
inversion  in  the  ML  channel  estimate  (4.1),  and  the  capability  to  track  channel  variations  other  than 
the  abrupt  changes  hypothesized  in  the  block  fading  channel  model  adopted  herein.  Although  this 
model  provides  a  tractable  method  to  obtain  insight  into  the  effects  of  temporal  channel  variations 
on  MIMO  performance,  it  does  not  accurately  represent  the  characteristics  of  real-world  MIMO 
channels.  Practical  radio  channels  change  with  time  continuously,  which  implies  that  their  estimates 
will  become  increasingly  unreliable,  unless  tracking  between  training  intervals  is  performed.  This 
also  implies  that  channel  information  obtained  in  previous  training  intervals  does  not  necessarily 
become  irrelevant  when  the  channel  estimate  is  updated,  but,  if  suitably  weighted,  can  be  used  to 
enhance  estimation  accuracy  [23].  Both  issues  can  possibly  be  addressed  by  modifying  the  update 
equations  (4.4)  and  approximating  (4.1)  using  a  gradient  descent  approach,  thus  avoiding  a  matrix 
inversion, as proposed in [24].
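
To illustrate the idea of an inversion-free update, the following sketch performs one stochastic-gradient (LMS-type) step on the squared error $\|y - Hs\|^2$. It is not claimed to be the method of [24]; the function name `lms_channel_update` and the step size are assumptions made for illustration.

```python
import numpy as np

def lms_channel_update(H_est, y, s, step):
    """One stochastic-gradient step on ||y - H s||^2 (no matrix inversion).

    H_est : current channel estimate, shape (N_r, N_t)
    y     : received vector, shape (N_r,)
    s     : known or estimated symbol vector, shape (N_t,)
    step  : step size; must be small relative to 1 / ||s||^2 for stability
    """
    error = y - H_est @ s
    return H_est + step * np.outer(error, s.conj())
```

Because each step uses only the current observation, the same recursion also tracks a slowly varying channel, at the cost of a steady-state error that grows with the step size.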


4.1.3 Noise variance estimation

In comparison to the estimation of the channel matrix H, the estimation of the noise variance $\sigma_n^2$ is
a minor problem, because it involves only a single parameter that is normally not rapidly varying.
Because $\sigma_n^2$ is identical for each receive antenna, it can be written as

$$\sigma_n^2 = \frac{1}{N_r} \operatorname{tr} E\{n n^{\dagger}\} = \frac{1}{N_r} \operatorname{tr}\{R_y - R_{ys} H^{\dagger} - H R_{ys}^{\dagger} + H R_s H^{\dagger}\}, \qquad (4.5)$$

where $R_y = E\{y y^{\dagger}\}$, and $R_{ys}$ and $R_s$ were defined previously. Replacing all matrices on the
right-hand side of (4.5) by their respective approximations, an estimate of $\sigma_n^2$ is obtained as

$$\hat{\sigma}_n^2 = \frac{1}{N_r} \operatorname{tr}\{\hat{R}_y - \hat{R}_{ys} \hat{R}_s^{-1} \hat{R}_{ys}^{\dagger}\}. \qquad (4.6)$$

The matrix $\hat{R}_y$ is computed by averaging $y y^{\dagger}$ over the same symbol intervals for which $\hat{R}_{ys}$ and $\hat{R}_s$
are computed.
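
The noise variance estimate can be sketched as follows, assuming that the sample correlation matrices are plain averages over N symbol intervals and that the channel estimate has the form $\hat{R}_{ys} \hat{R}_s^{-1}$. The function name `noise_variance_estimate` is hypothetical.

```python
import numpy as np

def noise_variance_estimate(Y, S):
    """Estimate the per-antenna noise variance.

    Y : received vectors as columns, shape (N_r, N)
    S : corresponding known or estimated symbol vectors, shape (N_t, N)
    """
    N_r, N = Y.shape
    R_y = Y @ Y.conj().T / N
    R_ys = Y @ S.conj().T / N
    R_s = S @ S.conj().T / N
    # tr{R_y - R_ys R_s^{-1} R_ys^H} / N_r, using a solve instead of an inverse
    residual = R_y - R_ys @ np.linalg.solve(R_s, R_ys.conj().T)
    return np.trace(residual).real / N_r
```

Since the channel estimate is fitted to the same data, this estimate is biased slightly low when N is not much larger than $N_t$.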


4.2  Correlated  fading 

It is well known that, from a theoretical point of view, MIMO systems can achieve maximum
capacity if the elements of the channel matrix H are uncorrelated, i.e., if $E\{h_{kl} h_{mn}^{*}\} = 0$ for $k \neq m$
and/or $l \neq n$. It is generally accepted that this channel condition occurs if the antenna elements at
both the transmitter and the receiver are located sufficiently far apart, and the physical propagation
environment can be described as being “rich scattering”. The latter term is used in a loose sense
to indicate environments where many multipath components, comparable in terms of path gain but
widely different with respect to their trajectories in space, are present. In real-world situations, the
MIMO channel matrix is often correlated among its elements because one or more of these
requirements are not fulfilled. This correlation, often referred to as fading correlation, has a detrimental
effect on information-theoretic MIMO capacity, because it decreases the average number of usable
spatial eigenmodes and their “gains”, i.e., the eigenvalues of $H H^{\dagger}$ [25, 26]. In addition, as was
discussed in Section 3.1, fading correlation degrades the performance of MLM-ITS detection relative
to optimum MAP detection.

In practical MIMO scenarios, the spacing between the elements in the transmit and receive
arrays is usually much smaller than the distance between any pair of transmit and receive antennas,
or from any antenna to the nearest scatterer. In such cases, the angles-of-arrival and amplitudes
of the received signals from a single transmit antenna, and, therefore, the receive correlation, are
approximately independent of the antenna’s location within the transmit array. Likewise, the
correlation between the transmitted signals measured at a single receive antenna is roughly independent
of the antenna’s location within the receive array. Under these assumptions, fading correlation can
be separated into transmit and receive correlation, and characterized completely by the transmit and
receive correlation matrices, $R_t = E\{H^{\dagger} H\}$ and $R_r = E\{H H^{\dagger}\}$, respectively. This is referred to
as the Kronecker approximation in the MIMO literature [27]. As these matrices are dependent on
the locations and other properties of scatterers surrounding the transmitter and receiver, they are
generally different from one propagation environment to another, and must be determined by means
of measurements. If relevant measurement data are not available, simple correlation models such as
the  uniform  correlation  model  [28] 


can be used. Here, the correlation coefficients $\rho_t$ and $\rho_r$ are restricted to be real-valued. Although
such models do not accurately represent the MIMO channel characteristics encountered in real
propagation environments, where the correlation coefficients are generally complex-valued and different
from one another, they are useful for obtaining insight into the effects of fading correlation on the
performance of MIMO systems.

A commonly used correlated Rayleigh fading MIMO channel model was proposed in [29, 30]
and is given by

$$H = \frac{1}{\sqrt{\operatorname{tr}\{R_r\}}}\, R_r^{1/2}\, G\, R_t^{1/2}, \qquad (4.8)$$

where G is an $N_r \times N_t$ random matrix with independent, zero-mean, unit-variance complex
entries drawn from a complex Gaussian distribution, and the matrix square root is defined such that
$R^{1/2} (R^{1/2})^{\dagger} = R$. The normalization factor in (4.8) ensures that the entries of H have unit variance.
Note that the channel model above is only valid for real-valued correlation matrices. For simulation
purposes it is further noted that, even if (4.8) were modified to be capable of dealing with
complex-valued $R_t$ and $R_r$, care must be taken that these correlation matrices are positive-definite in
order to be meaningful.
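
A correlated channel realization of the Kronecker type can be generated as in the following sketch. It assumes uniform correlation matrices with ones on the diagonal and a real coefficient $\rho$ elsewhere, so that the entries of H have unit variance without a further normalization factor; this scaling may differ from that used in (4.8). All function names are hypothetical.

```python
import numpy as np

def uniform_correlation(n, rho):
    """n x n matrix with ones on the diagonal and rho elsewhere (real rho)."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))

def sqrtm_psd(R):
    """Symmetric square root of a real, positive semi-definite matrix."""
    w, V = np.linalg.eigh(R)
    if w.min() < -1e-12:
        raise ValueError("correlation matrix is not positive semi-definite")
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def correlated_channel(N_r, N_t, rho_r, rho_t, rng):
    """One realization H = R_r^{1/2} G R_t^{1/2} with i.i.d. complex Gaussian G."""
    G = (rng.standard_normal((N_r, N_t))
         + 1j * rng.standard_normal((N_r, N_t))) / np.sqrt(2)
    A_r = sqrtm_psd(uniform_correlation(N_r, rho_r))
    A_t = sqrtm_psd(uniform_correlation(N_t, rho_t))
    return A_r @ G @ A_t
```

The eigenvalue check in `sqrtm_psd` rejects coefficient values for which the matrix is not a valid correlation matrix, for example $\rho < -1/(n-1)$ in the uniform model.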


Figure  4.2.  Mean  MIMO  channel  capacity  for  various  degrees  of  fading  correlation.  Channel  code  rate  is  1/2. 

Like  the  uniform  correlation  model  described  above,  the  channel  model  (4.8)  is,  in  general, 
not  an  accurate  representation  of  real-world  MIMO  channels,  whose  channel  coefficients  are  not 
necessarily  Rayleigh  fading.  Moreover,  the  properties  of  actual  MIMO  channels  are  not  completely 
determined  by  the  transmit  and  receive  fading  correlation.  For  example,  it  has  been  pointed  out 
in  [31]  that  the  rank  of  the  channel  matrix  can  be  consistently  equal  to  one  even  when  its  entries 
are uncorrelated, i.e., when both $R_t$ and $R_r$ are diagonal. In such cases, spatial diversity is still
available,  but  only  a  single  spatial  eigenmode  is  available  for  data  transmission,  as  in  single-antenna 
communications.  This  degenerate  channel  phenomenon  is  referred  to  as  the  keyhole  (sometimes: 
pinhole)  channel.  Statistical  channel  models  which  model  the  fading  of  the  entries  of  H  as  a  product 
process  of  two  complex  Gaussian  distributions  include  the  correlated  Rayleigh  fading  model  (4.8) 
and  the  keyhole  channel  as  special  cases  [32],  and  are  therefore  more  general. 

To investigate the effects of fading correlation on the performance of the MLM-ITS detector, a
MIMO channel was simulated with the aid of (4.8) and the uniform correlation model. The transmit
and receive correlation coefficients, $\rho_t$ and $\rho_r$, were varied from 0 to 0.8. In order to provide a
theoretical reference in terms of the performance loss associated with fading correlation, the mean
MIMO channel capacity [25]

$$C = E\left\{\log_2 \det\left(I + \frac{E_s}{N_t N_0} H H^{\dagger}\right)\right\} \qquad (4.9)$$

was evaluated as a function of $E_s/N_0$ for the same values of $\rho_t$ and $\rho_r$, and the results for a 4 x 4
MIMO configuration are shown in Fig. 4.2. These plots were obtained by averaging over 1000
independent realizations of H, which is sufficient for a very good approximation of the true mean
capacity. The information-theoretic performance loss relative to uncorrelated fading can be determined
from this figure, and is given in Table 4.1. The values in this table were obtained assuming that
the channel code rate is 1/2, and that the coded information is split into $N_t$ equal-rate substreams,
so that the overall information throughput is $N_t M_c/2$ bits/s/Hz. As discussed in Section 3.5, these
assumptions are consistent with the simulations of the MLM-ITS detector presented in this report.
It can be seen from Fig. 4.2 and Table 4.1 that the theoretical performance loss due to fading
correlation is relatively small for correlation coefficients up to 0.4, but can become substantial for larger
values. For example, the performance degradation of 4 x 4 64-QAM is 1.3 dB for $\rho_t = \rho_r = 0.4$,
and 6.4 dB for $\rho_t = \rho_r = 0.8$. It is noted that the effects of transmit and receive correlation are
identical. This is explained by the fact that the capacity formula (4.9) remains identical if the
matrix product $H H^{\dagger}$, which represents fading correlation at the receiver, is replaced by $H^{\dagger} H$, which
represents correlation at the transmitter.

Table 4.1. Information-theoretic performance loss, in dB relative to uncorrelated fading, for various degrees
of fading correlation. Channel code rate is 1/2.

                                      4 x 4 QPSK    4 x 4 16-QAM    4 x 4 64-QAM
$\rho_t = 0.4$, $\rho_r = 0$          0.3           0.6             0.6
$\rho_t = 0$, $\rho_r = 0.4$          0.3           0.6             0.6
$\rho_t = 0.4$, $\rho_r = 0.4$        0.7           1.1             1.3
$\rho_t = 0.8$, $\rho_r = 0$          1.9           2.9             3.3
$\rho_t = 0$, $\rho_r = 0.8$          1.9           2.9             3.3
$\rho_t = 0.8$, $\rho_r = 0.8$        3.6           5.5             6.4
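
The mean capacity can be estimated by Monte Carlo averaging, as in the sketch below. It assumes that the signal-to-noise ratio enters the capacity formula as `snr / N_t`, with `snr` a linear quantity playing the role of $E_s/N_0$; the function name `mean_capacity` is hypothetical.

```python
import numpy as np

def mean_capacity(H_samples, snr):
    """Average of log2 det(I + (snr / N_t) H H^H) over channel realizations.

    H_samples : array of shape (K, N_r, N_t)
    snr       : linear signal-to-noise ratio (assumed normalization)
    """
    K, N_r, N_t = H_samples.shape
    HH = H_samples @ H_samples.conj().transpose(0, 2, 1)
    A = np.eye(N_r) + (snr / N_t) * HH
    # slogdet works on stacked matrices and avoids overflow of the determinant
    _, logdet = np.linalg.slogdet(A)
    return np.mean(logdet) / np.log(2)
```

For a square configuration, passing the conjugate-transposed realizations gives the same value, which is the symmetry between transmit and receive correlation noted in the text.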

The error performance of the MLM-ITS detector in a correlated Rayleigh fading channel was
investigated by means of simulations. As previously, the fading correlation was modelled by the
uniform correlation model, and $\rho_t$ and $\rho_r$ were varied from 0 to 0.8. The effects of transmit and
receive correlation on system performance were found to be identical, as expected from theory, and
will therefore not be addressed separately. Performance results for varying $\rho_t$ and uncorrelated
fading at the receiver are shown in Fig. 4.3. It is seen that the performance degradation is approximately
1 dB for $\rho_t = 0.4$, and in the range from 3.5 to 6 dB for $\rho_t = 0.8$. These values are considerably
higher than the theoretical losses due to reduced channel capacity, listed in Table 4.1. The difference
between the theoretical and actually observed performance loss can be attributed to the reduced
efficiency of the MLM-ITS detector in approaching MAP performance as the channel matrix becomes
more correlated, as discussed in Section 3.1. Results for fading correlation at both the transmitter
and the receiver are shown in Fig. 4.4. As expected, performance loss in this case is higher than if
the fading at either end of the channel is uncorrelated. Again, due to the reduced performance of
the MLM-ITS detector relative to MAP detection, the observed losses are considerably higher than
the theoretical values of Table 4.1.

It is clear from the above discussion that the degree of fading correlation has a great impact on
the performance of the MLM-ITS detector and, indeed, any MIMO detection scheme. Performance
degradation of MLM-ITS detection with respect to the theoretical capacity bound is low for low
fading correlation, but increases dramatically if correlation approaches its maximum. Correlation
can sometimes be reduced by optimizing the transmitter and receiver locations so as to maximize
the number and angular spread of the outgoing and incident multipath waves, or by increasing the
spacing between the elements of the antenna arrays. An alternative approach to reducing the impact
of high fading correlation on ITS-based MIMO systems is to apply relative delay offsets to the
symbol streams transmitted from different antennas. This is discussed further in Section 4.4.


4.3  Slow  fading 

Figure 4.3. Error performance of a 4 x 4 ST-BICM MIMO system employing the MLM-ITS detector with M = 8,
for different degrees of transmit fading correlation. Number of iterations in the detector/decoder loop is four.
Channel code is a turbo code with rate 1/2 and memory 2.

Figure 4.4. Error performance of a 4 x 4 ST-BICM MIMO system employing the MLM-ITS detector with M = 8,
for different degrees of transmit and receive fading correlation. Number of iterations in the detector/decoder
loop is four. Channel code is a turbo code with rate 1/2 and memory 2.

Figure 4.5. Error performance of a 4 x 4 ST-BICM MIMO system employing the MLM-ITS detector with M = 8,
for different channel fading rates. Number of iterations in the detector/decoder loop is four. Channel code is a
turbo code with rate 1/2 and memory 2.

The simulation results discussed so far in this report were obtained with a channel model in which
the channel matrix changes to a new, statistically independent realization after every symbol
period, i.e., after every block of $N_t M_c$ code bits. Because channel coding and interleaving across
rapid channel variations yields high temporal diversity, the performance results obtained with this
model tend to be very good. Practical wireless systems are typically designed such that the channel
coherence time $T_c$, which is the duration for which the channel remains unchanged for practical
purposes, is large compared to the symbol period. This ensures that an accurate channel estimate
can be obtained in the training interval, which remains reliable during the transmission of a large
number of payload symbol vectors. If the coherence time is comparable to or larger than the frame
duration, $T_f$, however, performance is degraded because no temporal diversity is available. In that
case, channel realizations with few usable spatial eigenmodes [25, 26] are not compensated for by
better channel realizations, and the probability of error is bounded from below by the probability that the
instantaneous channel does not support the data throughput.
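
The probability that the instantaneous channel does not support a given throughput (the outage probability) can be estimated as in the following sketch. It assumes uncorrelated Rayleigh fading and the same `snr / N_t` normalization of the capacity expression as a working assumption; the function name `outage_probability` is hypothetical.

```python
import numpy as np

def outage_probability(N_r, N_t, snr, rate, trials, rng):
    """Fraction of i.i.d. Rayleigh channel realizations whose instantaneous
    capacity log2 det(I + (snr / N_t) H H^H) falls below `rate` (bits/s/Hz)."""
    G = (rng.standard_normal((trials, N_r, N_t))
         + 1j * rng.standard_normal((trials, N_r, N_t))) / np.sqrt(2)
    A = np.eye(N_r) + (snr / N_t) * (G @ G.conj().transpose(0, 2, 1))
    capacity = np.linalg.slogdet(A)[1] / np.log(2)
    return np.mean(capacity < rate)
```

With no temporal diversity, the frame error rate of any coding scheme at that rate cannot fall below this value, however strong the channel code.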

The dependency of the error performance of MLM-ITS detection on the number of
constant-channel blocks per frame, i.e., on the ratio $T_f/T_c$, was investigated by means of simulations. Perfect
channel estimation and uncorrelated fading were assumed. Fig. 4.5 shows results of these
simulations for values of $T_c$ equal to $T_f/64$, $T_f/16$ and $T_f/4$. The error performance corresponding to the
rapid fading case of Section 3.5, for which $T_c = T_s$, is shown for reference. Considerable
performance loss relative to $T_c = T_s$, ranging from approximately 3 dB for 4 x 4 QPSK to well over 5 dB
for 4 x 4 64-QAM, is observed for $T_c = T_f/4$. For $T_c = T_f/16$, i.e., 16 constant-channel blocks
per frame, performance degradation is limited to approximately 1 dB for 4 x 4 QPSK and
approximately 2 dB for 4 x 4 64-QAM. Finally, for $T_c = T_f/64$, performance degradation is smaller than 1
dB for all cases considered here. It is also seen in Fig. 4.5 that the rate at which the bit error rate
decreases with $E_s/N_0$ is smaller for configurations with lower diversity order, which is a well-known
phenomenon [33]. In general, it can be concluded that sensitivity to a lack of temporal diversity
increases as the signal constellation size becomes larger. In order to keep performance degradation
small, i.e., smaller than 1 dB, the frame duration should be chosen between one and two orders of
magnitude longer than the coherence time of the channel.

It is clear from the above discussion that, for a given frame duration and, hence, receiver delay,
the performance of MIMO systems with perfect channel knowledge degrades if the channel
coherence time increases. In real-world MIMO systems, however, increasing coherence time enables
more accurate channel estimation, which contributes to better error performance. If soft-decision
feedback is used, this does not necessarily come at the cost of increased channel training overhead.
It is therefore not possible to draw specific conclusions with regard to the effect of $T_c$ on the
performance of MIMO systems, and in particular the MLM-ITS detector, before a suitable channel
estimation and tracking scheme has been developed. As discussed in Section 4.1, this is the subject
of ongoing research. For a given coherence time, on the other hand, it is generally advisable to
choose the frame duration $T_f$, and hence the interleaving delay, as long as can be tolerated within
system requirements. The maximum interleaving delay that can be tolerated in practice depends
on the application, but is typically in the range from 10 to 100 ms. In indoor environments, where
the channel coherence time can be as high as 100 ms, considerable performance degradation should
be anticipated unless alternative diversity techniques such as frequency diversity, discussed in
Section 4.5, can be used.


4.4  Asynchronous  reception 

Literature on MIMO has hitherto dealt almost exclusively with synchronous reception, in which
signals transmitted from different antennas are synchronized at the receiver. Generally, it is
implicitly assumed that the antennas in the transmit and receive arrays are located sufficiently close to one
another that the maximum difference in propagation delay between each transmit/receive antenna
pair is very small compared to the symbol period. This means that the maximum coded bit rate
must be smaller than approximately $N_t M_c c_0/\Delta s$, where $\Delta s$ is the maximum separation between
any two transmit or receive antennas, and $c_0$ is the speed of light. For example, a 4 x 4 64-QAM
MIMO system with maximum array dimensions of 1 m can be operated synchronously as long as
the coded bit rate is much lower than approximately 7 Gbps, which is clearly not a limitation for
current wireless communication systems. However, asynchronous scenarios can occur, for example,
in the uplink of multiuser systems or in the downlink of networks with geographically distributed
transmit arrays. In such scenarios, existing MIMO detection schemes, including (MLM-)ITS, can
no longer be applied.
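
The 7 Gbps figure follows from the bound above with $N_t = 4$, $M_c = 6$ bits per 64-QAM symbol and a separation of 1 m, as the short calculation below shows. The function name `max_synchronous_bit_rate` is hypothetical.

```python
C0 = 299_792_458.0  # speed of light in m/s

def max_synchronous_bit_rate(n_tx, bits_per_symbol, max_separation_m):
    """Coded bit rate at which the worst-case propagation delay difference
    equals one symbol period: N_t * M_c * c0 / separation."""
    symbol_rate = C0 / max_separation_m
    return n_tx * bits_per_symbol * symbol_rate

rate = max_synchronous_bit_rate(4, 6, 1.0)
print(f"{rate / 1e9:.1f} Gbps")  # prints 7.2 Gbps
```

Synchronous operation requires the actual coded bit rate to be well below this value, not merely below it.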

A scheme called asynchronous iterative trellis search (A-ITS) detection, which is suitable for
asynchronous scenarios, was proposed and evaluated in [34]. Although the assumption of small
separation between receive antennas was still made in [34], it is straightforward to modify the
A-ITS detector in order to accommodate arbitrarily located transmit and receive antennas. The A-ITS
detection scheme is similar to MLM-ITS detection in the sense that it employs the M-algorithm
in order to search for the symbol sequences most likely to have been transmitted, given the
received signal and any available a priori information. However, as its name indicates, this search is
performed over a trellis instead of a tree structure. In the case of synchronous reception, the
performance of the A-ITS detector is identical to that of the MLM-ITS detector. Interestingly, it has been
shown that, in asynchronous scenarios, the performance of the A-ITS detector can be considerably
better than in the synchronous case. This improvement can be explained by the fact that the
correlation between symbols transmitted from different antennas is the result of both spatial and temporal
correlation. Temporal correlation, which is determined by the “overlap” between the time intervals
in which the different symbols are received, is at its maximum in the case of synchronous reception,
and hence decreases when reception becomes asynchronous.

While spatial correlation is a consequence of the physical attributes of the channel and the
antennas, temporal correlation can be controlled by adjusting the relative timing of the spatially
multiplexed symbol streams at the transmitter. It can therefore be concluded that the performance
of MIMO systems can be improved considerably by purposely applying delay offsets to the parallel
data streams. This has indeed been demonstrated in [34]. The moderately higher complexity
associated with asynchronous detection is expected to be far outweighed by this performance
improvement. This technique may be especially useful in the presence of highly correlated fading, where
even small decreases in symbol correlation can lead to considerable performance improvement, as
discussed in Section 4.2. Furthermore, this technique eliminates the requirement of (MLM-)ITS
detection that $N_r \geq N_t$, which is necessary for the transformation of the channel matrix into
triangular form, as discussed in Section 3. In A-ITS detection, it is guaranteed that the channel matrix
relating the received signals to the transmitted symbols can be transformed to a band-limited lower
triangular matrix as long as symbols are not fully correlated. This, in turn, is guaranteed if the data
substreams transmitted from different antennas are not synchronized at the receiver.

4.5  Frequency-selective  fading 

So far it has been assumed that the MIMO channel is frequency non-selective over the system
bandwidth, as reflected in the channel model (2.3). In real-world MIMO systems, however, the
delay spread in the channel may be comparable to or larger than the symbol period $T_s$, which leads to
inter-symbol interference. In practice, the delay spread typically ranges from approximately 100 ns
in confined indoor environments to approximately 10 µs in hilly outdoor areas. This means that
frequency-selective fading becomes significant at coded bit rates higher than approximately $N_t M_c$
times 0.1 to 10 MHz, depending on the application. For example, a 4 x 4 64-QAM indoor MIMO
system would be considered wideband for coded bit rates higher than approximately 2 Mbps. The
basic ITS and MLM-ITS detectors described in Section 3 are not directly applicable in wideband
channels.

One approach to dealing with frequency-selective fading is to divide the system bandwidth
into smaller subbands, each of which is narrower than the coherence bandwidth, which is usually
defined as the inverse of the delay spread. The coded and interleaved bit stream is then
serial-to-parallel converted into parallel streams, each of which is transmitted in a different subband.
Transmission and reception in each subband proceed in the same manner as in the narrowband
case of Section 3. This approach has been adopted in systems that are referred to as discrete matrix
multitone (DMMT) and MIMO-OFDM [35-37]. Its main advantages are that it can be used in
conjunction with narrowband MIMO detection schemes such as MLM-ITS detection, and has high
robustness against frequency-selective fading and narrowband interference. Because the fading
in the different subbands tends to be uncorrelated, this approach exploits the frequency diversity,
sometimes also referred to as multipath diversity, inherent in wideband channels. Known practical
problems with multi-carrier techniques include the need for extensive channel training and relatively
long guard intervals, as well as the high peak-to-average power ratio of the combined signal, which
reduces the efficiency of the power amplifier.

An alternative, single-carrier approach to dealing with frequency-selective fading is to have the
detection scheme compensate for its effects in the time domain, using so-called adaptive
equalization. A wideband MIMO channel is characterized in the time domain by multiple channel matrix
taps, $H(l)$, $l = 0, \ldots, L$, where the integer number L represents the maximum time delay spread,
normalized to the symbol period $T_s$. The received signal is given by

$$y(k) = \sum_{l=0}^{L} H(l)\, s(k - l) + n(k), \qquad (4.10)$$

where bracketed indices are now used to indicate the time-dependent nature of the variables y, H,
s and n. Note that, apart from the time indices, (4.10) with L = 0 is identical to the narrowband
channel model (2.3). For L > 0, the received signal vector depends not only on the symbol vector
transmitted in the current symbol interval, but also on previously transmitted symbol vectors. As
a consequence of this non-zero channel memory length, high-likelihood symbol vectors cannot be
found by means of a tree search, as in narrowband MLM-ITS detection. Instead, a non-exhaustive
trellis search can be employed, similar to that in asynchronous MIMO detection [34]. Although the
maximum number of state transitions to be searched for each symbol would be no less than $2^{L N_t M_c}$,
it is expected that a reduced-state trellis search would generally provide good results, especially
since the matrix taps tend to be statistically independent, which results in a frequency diversity
gain. A more detailed evaluation of trellis-search based adaptive equalization for MIMO, possibly
in conjunction with asynchronous MIMO detection, is part of ongoing research.

DRDC Ottawa TR 2003-242 CRC-RP-2003-009
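
The wideband channel model (4.10) can be simulated as in the sketch below, which assumes that no symbols are transmitted before k = 0 and that the noise is circularly symmetric complex Gaussian. The function name `wideband_receive` and its argument layout are hypothetical.

```python
import numpy as np

def wideband_receive(H_taps, S, noise_std, rng):
    """y(k) = sum_l H(l) s(k - l) + n(k) for k = 0..K-1, with s(k) = 0 for k < 0.

    H_taps    : matrix taps, shape (L + 1, N_r, N_t)
    S         : transmitted symbol vectors as columns, shape (N_t, K)
    noise_std : standard deviation of each complex noise sample
    """
    n_taps, N_r, N_t = H_taps.shape
    K = S.shape[1]
    Y = np.zeros((N_r, K), dtype=complex)
    for l in range(min(n_taps, K)):
        # Tap l contributes the symbol vectors delayed by l symbol periods
        Y[:, l:] += H_taps[l] @ S[:, :K - l]
    noise = noise_std / np.sqrt(2) * (rng.standard_normal((N_r, K))
                                      + 1j * rng.standard_normal((N_r, K)))
    return Y + noise
```

With a single tap the function reduces to the narrowband model, which gives a convenient consistency check for a wideband detector implementation.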
5 Discussion and conclusions


This  report  has  addressed  the  performance  of  iterative  MIMO  detection  under  real-world  conditions 
including  imperfect  channel  knowledge  and  spatially  correlated  and  slow  fading.  Approaches  to 
dealing  with  these  non-ideal  characteristics,  as  well  as  other  real-world  issues  such  as  frequency- 
selective fading and non-synchronized reception, were also addressed. The results of this study are
relevant  for  the  implementation  of  real-world  ST-BICM  MIMO  wireless  systems.  Although  the 
results  presented  herein  are  based  on  the  (MLM-)ITS  detection  scheme,  they  are  expected  to  be 
typical  of  most  iterative  MIMO  detection  schemes.  The  main  conclusions  of  this  report  are  related 
to  channel  estimation  and  the  modification  of  the  ITS  scheme  to  enable  wideband  and  asynchronous 
reception,  as  discussed  next. 


5.1  Channel  estimation 

The assumption that perfect channel knowledge is available is the least realistic one in the
presentation of ITS detection in Section 3, and the development of a suitable MIMO channel
estimation scheme is a necessity in the implementation of any practical MIMO system. Channel
estimation by means of the transmission and processing of pilot symbol vectors and the use of
soft-decision feedback was discussed in Section 4.1. The simulation results in that section show
that, if the coded bit rate is higher than approximately 100 N_t M_c / T_c, channel training alone
is sufficient to obtain the same performance as with perfect channel knowledge. If the channel is
rapidly changing, i.e., if the coded bit rate is lower than approximately 100 N_t M_c / T_c, a more
sophisticated channel estimation scheme is required. As discussed in Section 4.1.2, feedback of the
soft information generated by the detector is probably the most promising approach in that case. It
is expected that a typical attribute of ITS detection, namely the fact that it generates a list of
“good” symbol vectors and their log-likelihoods, can be exploited to improve on existing
soft-decision feedback schemes. Whereas existing schemes degrade system performance if reliable
soft feedback is absent, the new scheme would weight its channel estimation update according to the
detection reliability of the current symbol vector. Channel estimation accuracy with soft-decision
feedback therefore cannot be worse than without it.
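
The reliability weighting described above can be illustrated with a minimal sketch. It assumes that
the detector supplies a list of candidate symbol vectors together with their log-likelihoods. The
function names and the particular reliability measure are hypothetical choices made for this
illustration and are not taken from the report.

```python
import math


def candidate_weights(log_likelihoods):
    """Normalise candidate log-likelihoods into probabilities.

    The peak value is subtracted before exponentiation so that the
    computation does not overflow or underflow for large magnitudes.
    """
    peak = max(log_likelihoods)
    unnormalised = [math.exp(value - peak) for value in log_likelihoods]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def soft_symbol_vector(candidates, weights):
    """Return the probability-weighted average of the candidate vectors."""
    length = len(candidates[0])
    return [
        sum(w * vector[i] for w, vector in zip(weights, candidates))
        for i in range(length)
    ]


def reliability(weights):
    """Hypothetical reliability measure in [0, 1] (an assumption).

    It is 0 when all candidates are equally likely and 1 when a single
    candidate carries all of the probability mass.
    """
    count = len(weights)
    if count == 1:
        return 1.0
    floor = 1.0 / count
    return (max(weights) - floor) / (1.0 - floor)


# Example: two candidate vectors that differ in their first symbol only.
candidates = [(1 + 1j, -1 - 1j), (1 - 1j, -1 - 1j)]
weights = candidate_weights([-0.2, -3.1])
soft_vector = soft_symbol_vector(candidates, weights)
update_scale = reliability(weights)  # would scale the channel estimate update
```

A reliability of zero would leave the channel estimate untouched, which is the sense in which the
weighted update is not expected to perform worse than training alone.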

The main drawback of the use of soft-decision feedback is its relatively high complexity, which
is partly caused by the processing of a list of symbol vectors instead of a single, known pilot
symbol vector, and also by the matrix inversion that is required to compute the channel estimate.
It may be possible to reduce the complexity of the former task by limiting the number of candidate
symbol vectors from which the channel is estimated. The matrix inversion can possibly be avoided by
directly updating the channel estimate using a gradient descent approach, as in [24].
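
As an illustration of an inversion-free update, the following minimal sketch applies a least mean
squares (LMS) style stochastic gradient step to the channel estimate. It is an assumed, generic
formulation and not necessarily the method of [24]. The function name, the step size mu and the
optional weight argument are hypothetical.

```python
def lms_channel_update(h_est, y, s, mu, weight=1.0):
    """Return an updated channel estimate after one LMS-style step.

    h_est  : current estimate, a list of n_rx rows with n_tx entries each
    y      : received vector (length n_rx)
    s      : known or soft symbol vector (length n_tx)
    mu     : step size (assumed small enough for stability)
    weight : hypothetical reliability factor in [0, 1] scaling the step
    """
    n_rx, n_tx = len(h_est), len(s)
    # Prediction error e = y - H s for the current estimate.
    error = [
        y[i] - sum(h_est[i][j] * s[j] for j in range(n_tx))
        for i in range(n_rx)
    ]
    step = mu * weight
    # Gradient step H <- H + step * e * s^H; no matrix inversion is needed.
    return [
        [h_est[i][j] + step * error[i] * s[j].conjugate() for j in range(n_tx)]
        for i in range(n_rx)
    ]


# Example: noise-free tracking of a fixed 2 x 2 channel with unit-energy vectors.
a = 2 ** -0.5
h_true = [[1 + 1j, 0.5], [-0.3j, 2.0]]
h_est = [[0j, 0j], [0j, 0j]]
for _ in range(40):
    for s in ([a, a], [a, -a], [a, 1j * a], [a, -1j * a]):
        y = [sum(h_true[i][j] * s[j] for j in range(2)) for i in range(2)]
        h_est = lms_channel_update(h_est, y, s, mu=0.5)
```

Each step costs on the order of n_rx times n_tx multiplications. The price is slower convergence
than a direct least-squares solution, so the step size would have to be tuned to the fading rate.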

5.2 Enabling wideband and asynchronous reception

In certain high-data-rate applications, where multipath is severe or the separation between array
elements is very large, propagation delay differences between multipath components or between
different pairs of transmit and receive antennas can become significant with respect to the symbol
period. As discussed in Sections 4.4 and 4.5, the (MLM-)ITS detection scheme cannot be applied in
such situations. Instead, a trellis-search-based scheme referred to as A-ITS [34] could be used.
This scheme is in many respects similar to MLM-ITS, but is capable of dealing with received symbols
that overlap with both previously and subsequently received symbols.

In addition to mitigating the adverse effects of frequency-selective fading, it is expected that
the A-ITS detection scheme can exploit the frequency diversity available in such channels, and
therefore provide enhanced performance. This is particularly desirable in indoor applications,
where, as discussed in Section 4.3, temporal diversity gain is typically low. It has been shown in
[34] that considerable performance improvement relative to MLM-ITS can be achieved even in
narrowband channels, by purposely applying delay offsets to the symbol streams transmitted from
different antennas. This improvement is due to the lower symbol correlation resulting from the
smaller temporal overlap between symbols, which is known to improve the efficiency of the
breadth-first trellis search employed in both MLM-ITS and A-ITS detection. It is expected that
asynchronous MIMO reception is especially beneficial in the case of severe spatial fading
correlation, to which MLM-ITS is vulnerable, as discussed in Section 4.2. Furthermore, the use of
A-ITS eliminates the limitation that the number of receive antennas should be at least equal to the
number of transmit antennas, which makes the scheme more widely applicable, for example in transmit
diversity systems.

A potential drawback of A-ITS detection is its moderately higher complexity. However, like
MLM-ITS, A-ITS offers the possibility of trading performance against complexity. It is expected
that the complexity increase associated with asynchronous detection is far outweighed by the
performance gain and other advantages mentioned above. Other issues that need to be addressed
concerning A-ITS detection include the question of whether, and how much, oversampling is required
in order to deal with asynchronously received signals, as well as the problem of wideband and
asynchronous channel estimation.
Throughout this report, the notation ŝ is used to indicate an estimate of s. Upper and lower-case bold
List of abbreviations and acronyms


A-ITS       asynchronous ITS
APP         a posteriori probability
BCJR        Bahl-Cocke-Jelinek-Raviv (authors of BCJR algorithm)
BER         bit error rate
BLAST       Bell Labs layered space-time
CRC         Communications Research Centre Canada
DMMT        discrete matrix multitone
DND         Department of National Defence
DRDC        Defence R&D Canada
ITS         iterative tree/trellis search
LLR         log-likelihood ratio
MAP         maximum a posteriori
MIMO        multiple-input multiple-output
ML          maximum likelihood
MLM-ITS     multilevel mapping ITS
OFDM        orthogonal frequency division multiplexing
QAM         quadrature amplitude modulation
QPSK        quaternary phase-shift keying
SC-MMSE     soft-cancellation minimum mean squared error
SNR         signal-to-noise ratio
ST-BICM     space-time bit-interleaved coded modulation